CHAPTER 1:
Introduction 


1.1  Introduction 

In  radio  communications,  the  frequencies  between  3  megahertz  (MHz)  and  30  MHz  are 
commonly  referred  to  as  the  high  frequency  (HF)  band.  The  main  advantage  of  using  this 
frequency band for communication is that it allows global coverage without any infrastructure.
This is due to those frequencies’ ability to reflect off the ionosphere. Therefore, a
notable characteristic of the HF band is that propagation conditions change continuously
with changes in the ionosphere. The properties of the ionosphere are heavily dependent
on a number of factors, including time of day and year, geographic location, and the sun’s
11-year cycle. When establishing a link, i.e., calling another radio station in order to transfer
information, all these factors and more must be taken into account when selecting transmission
parameters such as frequency and power [1]. To achieve reliable communications on
the  HF  radio  bands,  skilled  and  experienced  operators  are  therefore  normally  needed. 

In  the  past  few  decades,  advances  in  automatic  link  establishment  (ALE)  technology  have 
allowed  relatively  unskilled  operators  to  operate  HF  radios  and  establish  communication 
links  with  success  rates  and  times  close  to  those  of  skilled  and  experienced  operators.  The 
addition of automation to any system inevitably introduces both new security issues and
new variants of previous issues. The second-generation (2G) and third-generation (3G)
ALE  standards  address  this  by  including  an  option  for  encrypting  the  link  establishment 
messages  that  are  sent  over  the  air  [1]. 


1.2  Purpose  and  Motivation 

The  purpose  of  this  thesis  is  to  study  the  security  of  the  SoDark  family  of  ciphers  that  are 
used  to  encrypt  link  establishment  messages  in  the  2G  and  3G  ALE  standards.  The  security 
provided  by  the  ciphers  directly  affects  the  performance  of  ALE  systems  in  the  presence  of 
adversarial  electronic  warfare  measures,  which  makes  knowledge  of  their  security  bounds 
important. 

The SoDark algorithms have been developed specifically for the ALE application [1]. No
public cryptanalysis of the algorithms is available, so their security is, in effect, unknown.
Both 2G and 3G ALE systems are in active use worldwide by users ranging from government
and military to non-governmental organizations and amateur radio operators [2]. If cryptographic
weaknesses exist in the ciphers protecting these users’ ALE HF communications,
knowledge of those weaknesses might help the users compensate for them and,
eventually, eliminate them.

1.3  Methodology 

The bounds of security of ciphers are established through cryptanalysis, described in Chapter 2.
For academic purposes, any weakness in a cryptographic system is enough for it to
be  considered  broken.  This  includes  attacks  that  are  infeasible  in  practice  or  only  possible 
under  very  special  circumstances.  A  cipher  is  considered  broken  in  practice  if  an  attack  that 
affects  the  security  provided  by  the  cipher  can  be  performed  in  some  real-life  setting  [3]. 

As  such,  the  method  employed  in  academic  cryptanalysis  is  that  of  hypothesis  testing.  The 
null  hypothesis,  then,  is  that  the  cipher  is  secure  and  that  the  most  efficient  way  to  attack  it 
is  through  an  exhaustive  key  search  (see  Chapter  2).  An  attack  on  the  cipher  that  requires 
less  effort  than  this  constitutes  a  falsification  of  the  null  hypothesis. 


1.4  Thesis  Outline 

Chapter  2  provides  a  brief  theoretical  background  on  a  number  of  concepts  in  cryptography 
and  information  security  that  are  central  to  the  material  covered  in  the  rest  of  the  thesis. 
The  chapter  also  includes  a  brief  overview  of  ALE  technology. 

Chapter  3  contains  a  description  of  the  SoDark  family  of  ciphers,  mainly  based  on  the 
specifications  in  [1],  [4],  and  [5].  It  also  introduces  the  mathematical  notation  used  in 
the  cryptanalysis  of  the  ciphers.  The  chapter  also  investigates  the  properties  and  selection 
criteria  of  the  SoDark  S-box  and  generalizes  the  cipher’s  structure  to  the  Even-Mansour 
(EM)  construction.  A  brief  investigation  of  the  cipher’s  properties  with  regard  to  linear 
cryptanalysis  is  also  performed. 

Chapter 4 contains the main contributions of the thesis: differential-based structural key
recovery  attacks  on  up  to  eight  rounds  of  the  24-bit  SoDark-3  algorithm. 

Chapter  5  describes  the  process  of  generating  efficient  logic  circuit  representations  of  the 
SoDark  S-box.  The  logic  circuit  representations  are  used  in  the  attacks  presented  in 
Chapters  6  and  7. 

Chapter  6  describes  the  conversion  of  the  logic  circuit  representations  from  Chapter  5  into 
conjunctive  normal  form  (CNF)  and  the  use  of  Boolean  satisfiability  problem  (SAT)  solvers 
for  key  recovery  attacks  on  up  to  four  rounds  of  SoDark-3. 

Chapter 7 describes the development of a high-performance bitslicing CUDA implementation
for brute force key recovery attacks on the full cipher. Conversion of the developed
known-plaintext  attack  into  a  ciphertext-only  attack  is  described. 

Chapter 8 concludes the thesis with a summary of the main results. It investigates the consequences
of the results on the ALE system and provides recommendations. A replacement
cipher,  based  on  best  practices,  is  suggested.  The  chapter  finishes  with  a  brief  description 
of  possible  areas  of  study  for  further  cryptanalysis  of  the  SoDark  cipher  family. 




CHAPTER  2: 
Background 


2.1  Information  Security 

Information  security  is  a  term  used  for  the  practices  concerning  the  protection  of  information, 
regardless  of  its  physical  form.  The  notion  of  protection  is  primarily  expressed  in  three 
core concepts: confidentiality, integrity, and availability. Other concepts such as non-repudiation,
accountability, reliability, or variants thereof are sometimes included as further
core concepts. Here, however, focus is on the three primary concepts, which are defined as follows:

-  Confidentiality  is  the  protection  of  the  information  content  itself  so  that  only  those 
authorized  are  able  to  access  and  use  it. 

-  Integrity  is  the  protection  of  information  against  unauthorized  change  as  well  as  the 
ability  to  detect  unauthorized  changes  that  have  been  made. 

-  Availability  is  the  protection  of  the  ability  to  access  the  information  so  that  it  is 
available  for  authorized  users  to  read  or  modify.  As  such,  the  protection  is  against 
physical  loss  of  the  information  itself  as  well  as  against  loss  of  the  ability  to  access  or 
transfer  it. 

Methods for achieving the three aforementioned conditions vary depending on the consequences
of failure to protect information as well as the information’s physical form. They can
include  legislation,  physical  obstacles,  backups,  spare  systems,  training,  and  authentication 
mechanisms  as  well  as  mathematical  and  computer  algorithms  such  as  cryptosystems  [6]. 

2.2  Block  Ciphers 

Block  ciphers  are  prevalent  as  fundamental  building  blocks  of  other  algorithms  or  protocols 
that  aim  to  provide  confidentiality,  integrity,  or  availability  in  digital  systems.  In  that  regard, 
they  are  known  as  cryptographic  primitives.  Their  basic  purpose  is  to  provide  a  means  of 
transforming  messages  between  plaintext  space  and  ciphertext  space  using  a  secret  key. 
To do this, a block cipher specifies an encryption function $E$ and a decryption function
$D = E^{-1}$ that take the key and message as parameters. In other words: $C = E_K(P)$ and
$P = D_K(C)$. Note that for this to work, $E_K$ must be bijective so that its inverse $D_K$
exists and $D_K(E_K(P)) = P$ for all $K$ and $P$. The relationship between $E$, $D$, $P$, $C$, and $K$ is shown
schematically in Figure 2.1.


Figure 2.1. A generic block cipher with encryption function E and decryption
function D.


In digital block cipher systems, the sets of plaintexts and ciphertexts consist of all binary
strings of a certain length $n$: $P, C \in \{0, 1\}^n$. This length is known as the block size. The
key is also a binary string of fixed length, $K \in \{0, 1\}^k$, but there is no requirement for the
key size $k$ to be the same as the block size $n$. Nevertheless, this is the case for some ciphers
such as the Advanced Encryption Standard (AES), when it is used with a 128-bit key [7].
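
The definitions above can be illustrated with a minimal sketch of a toy block cipher. The cipher below is not SoDark and is not taken from any standard. It is a hypothetical 16-bit Feistel network with a 16-bit key, and the names `encrypt`, `decrypt`, and `_round_function` are introduced here only for illustration. A Feistel structure is assumed because it is invertible by construction, which makes $D_K(E_K(P)) = P$ easy to check.

```python
import hashlib

BLOCK_BITS = 16  # toy block size n (assumption, far too small for real use)
KEY_BITS = 16    # toy key size k (assumption)


def _round_function(key: int, rnd: int, half: int) -> int:
    """Derive one pseudo-random byte from the key, round index and half-block."""
    data = key.to_bytes(2, "big") + bytes([rnd, half])
    return hashlib.sha256(data).digest()[0]


def encrypt(key: int, block: int, rounds: int = 4) -> int:
    """E_K(P): run the Feistel rounds in forward order."""
    left, right = block >> 8, block & 0xFF
    for rnd in range(rounds):
        left, right = right, left ^ _round_function(key, rnd, right)
    return (left << 8) | right


def decrypt(key: int, block: int, rounds: int = 4) -> int:
    """D_K(C): undo the Feistel rounds in reverse order."""
    left, right = block >> 8, block & 0xFF
    for rnd in reversed(range(rounds)):
        left, right = right ^ _round_function(key, rnd, left), left
    return (left << 8) | right


if __name__ == "__main__":
    key, plaintext = 0x1234, 0xBEEF
    ciphertext = encrypt(key, plaintext)
    print(f"C = {ciphertext:04x}, D_K(C) = {decrypt(key, ciphertext):04x}")
```

Because every round can be undone individually, the mapping $E_K$ is a permutation of the 16-bit block space for every key, regardless of how the round function behaves.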

A  block  cipher  provides  security  by  making  it  computationally  infeasible  to  discover  the 
plaintexts  of  any  number  of  given  ciphertexts,  or  discover  the  key  used  to  generate  them. 
The  inverse,  calculating  the  ciphertexts  of  any  number  of  given  plaintexts,  should  also  be 
infeasible.  This  is  perhaps  the  most  important  requirement  for  any  cipher — that  the  security 
must  rely  only  on  the  key.  In  other  words,  knowledge  of  the  cipher  algorithm  or  any  number 
of  plaintexts  or  ciphertexts  should  not  allow  an  attacker  to  gain  any  more  information 
about  the  key  or  unknown  plaintexts  or  unknown  ciphertexts.  This  principle  is  known  as 
Kerckhoffs’  principle  [8]  and  is  a  fundamental  requirement  in  all  modern  cryptography. 

The  properties  that  ciphers  must  have  in  order  to  be  secure  are  described  in  Shannon’s 
seminal  paper  [9].  In  particular,  he  introduces  the  two  principles  of  diffusion  and  confusion 
that  are  used  to  prevent  statistical  analysis.  Diffusion  means  that  any  properties  of  parts 
of  the  plaintext  should  be  spread  out  over  as  much  of  the  ciphertext  as  possible.  In  block 
ciphers,  this  means  that  the  avalanche  effect  is  a  desirable  property.  That  is,  the  change 
in  a  single  bit  of  the  plaintext  should  cause  the  probability  of  change  for  any  given  bit 
of ciphertext to be 1/2 [7]. While demonstration of the avalanche effect shows a cipher
has  diffusion,  it  is  not  enough  to  prove  any  level  of  security.  For  this,  confusion  is  also 
necessary, which is in essence the same statement for the key: no simple statistical relation
between  the  key  and  ciphertext  should  exist.  Both  diffusion  and  confusion  are  necessary 
for  a  cipher  to  be  secure.  A  cipher  without  one  or  the  other  is  likely  to  be  vulnerable  to 
statistical  attacks. 
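
The avalanche effect can be estimated empirically. The following sketch flips each plaintext bit in turn and counts the fraction of ciphertext bits that change. It uses a hypothetical 16-bit toy Feistel cipher (`toy_encrypt`, with a SHA-256-based round function chosen only for convenience), not any of the ciphers studied in this thesis. The key and the sample set are arbitrary assumptions.

```python
import hashlib

KEY = 0x3C5A  # arbitrary toy key (assumption)


def toy_encrypt(key: int, block: int, rounds: int = 4) -> int:
    """Hypothetical 16-bit Feistel cipher used only as a test subject."""
    left, right = block >> 8, block & 0xFF
    for rnd in range(rounds):
        data = key.to_bytes(2, "big") + bytes([rnd, right])
        left, right = right, left ^ hashlib.sha256(data).digest()[0]
    return (left << 8) | right


def avalanche_rate(samples) -> float:
    """Fraction of ciphertext bits that change when one plaintext bit flips."""
    flipped = total = 0
    for plaintext in samples:
        base = toy_encrypt(KEY, plaintext)
        for bit in range(16):
            diff = base ^ toy_encrypt(KEY, plaintext ^ (1 << bit))
            flipped += bin(diff).count("1")
            total += 16
    return flipped / total


rate = avalanche_rate(range(0, 65536, 257))

if __name__ == "__main__":
    print(f"average fraction of flipped ciphertext bits: {rate:.3f}")
```

A rate close to 1/2 indicates diffusion with respect to the plaintext. As noted above, such a measurement says nothing about confusion and is therefore not evidence of security.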

The  fact  that  the  security  of  a  block  cipher  should  only  be  dependent  on  the  key  makes  the 
size  of  the  key  space  important.  If  the  key  space  is  too  small,  an  adversary  that  has  access  to 
a  small  number  of  ciphertexts  and  their  corresponding  plaintexts  can  simply  perform  trial 
decryption  with  all  possible  keys  until  the  correct  one  is  found.  Note  that,  by  the  pigeonhole 
principle,  if  the  key  size  is  larger  than  the  block  size,  then,  for  any  plaintext,  there  exists  at 
least  one  ciphertext  that  is  generated  by  more  than  one  key  and  vice  versa. 

As  mentioned  previously,  block  ciphers  are  usually  not  used  to  directly  encrypt  messages 
in  block-sized  chunks.  Instead,  they  are  used  as  cryptographic  primitives  in  block  cipher 
modes  of  operation.  These  modes  prevent  certain  security  issues  associated  with  using 
block  ciphers  directly,  enable  encryption  of  variable-length  messages,  and  provide  other 
desirable properties such as authentication [7]. While modes of operation are very important
in  the  larger  context  of  the  use  of  block  ciphers,  they  have  no  bearing  in  the  context  of  the 
usage  of  the  ciphers  studied  in  this  thesis. 

Some  block  ciphers  have  a  third  input  to  the  encryption  function,  in  addition  to  the  key 
and  plaintext,  called  a  tweak.  The  first  widely  known  cipher  algorithm  to  use  a  tweak  was 
probably  the  Hasty  Pudding  AES  candidate  [10].  A  tweak  provides  additional  keying  bits 
that,  unlike  the  key,  are  not  necessarily  secret  [11].  The  tweak  is  normally  stored  or  sent 
along  with  the  plaintext.  The  purpose  of  a  tweak  is  to  improve  the  security  of  the  cipher 
with  the  additional  non-secret  bits.  Ideally,  no  two  plaintexts  should  be  encrypted  with  the 
same  combination  of  key  and  tweak.  The  cipher  must  still  be  secure  even  if  that  is  the 
case,  as  the  tweak  input  may  not  be  used  at  all  in  some  applications.  Worse,  it  could  be 
controlled  by  an  adversary.  Like  the  other  inputs  to  a  block  cipher,  the  output  should  exhibit 
the  avalanche  property  with  respect  to  the  tweak. 

2.3 Automatic Link Establishment Systems

As  mentioned  in  Section  1.1,  the  performance  of  HF  radio  systems  is  highly  dependent 
on  ionospheric  conditions.  The  most  important  factors  affecting  the  properties  of  the 
ionosphere  with  respect  to  HF  radio  are:  time  of  day  and  year,  geographic  location,  and  the 
sun’s  11-year  cycle.  Additionally,  equipment  parameters  such  as  output  power,  antennas, 
and selected modulation also affect propagation. Figure 2.2 shows an HF radio propagation
diagram  generated  with  the  Voice  of  America  Coverage  Analysis  Program  (VOACAP)  [12]. 
The  diagram  shows  maximum  usable  frequency  (MUF),  lowest  usable  frequency  (LUF), 
and  frequency  of  optimum  transmission  (FOT)  for  communication  between  two  geographic 
locations  during  different  times  of  day.  It  should  be  apparent  that  propagation  conditions 
change  over  time. 


Figure 2.2. Example HF propagation diagram showing maximum usable
frequency (MUF), lowest usable frequency (LUF), and frequency of optimum
transmission (FOT), plotted against UTC hour, between Grimeton, Sweden, and
Long Island, New York, during September 2017. Produced using VOACAP.


Understanding  and  utilizing  the  ionospheric  conditions  correctly  as  an  HF  radio  operator 
requires  training  and  experience.  ALE  technologies  were  created  to  offset  most  of  this  need 
with technology. In an ALE system, a computer selects transmission parameters such as
frequency  and  power  using  a  model  of  the  ionosphere  fed  with  a  large  number  of  parameters. 
In  addition,  some  ALE  systems  perform  regular  soundings  where  one  or  more  stations  in  an 
ALE  radio  network  transmit  sounding  signals  that  are  used  by  receiving  stations  to  measure 
current  propagation  conditions  on  different  frequencies,  thereby  improving  the  model’s 
predictive  accuracy  [1]. 

The  first  ALE  systems  were  proprietary  developments  by  a  number  of  commercial  vendors. 
Interoperability  suffered  as  a  consequence.  In  response  to  this,  2G  ALE  was  developed  and 
standardized in MIL-STD-188-141 [4] and FS-1045 [13]. This enabled interoperability
between radios from different manufacturers as well as between organizations [1].

Radios  in  ALE  systems  exchange  messages  in  the  form  of  protocol  data  units  (PDU).  All 
2G ALE PDUs are exactly 24 bits long and consist of a three-bit preamble and three seven-bit
ASCII characters. A typical call from one 2G ALE radio to another with a request to
establish a communications link will consist of three PDUs. The first two are identical and
contain the intended receiver’s address while the third contains the sender’s address. For
example, the first and second would contain the preamble <TO> (010 in binary) followed
by a three ASCII character address, such as SAM. This example would be hex encoded as
54e0cd. The third PDU in this example could contain the preamble <TIS> (101 in binary)
followed by the sender address JOE and be hex encoded as b2a7c5 [5].
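
The PDU layout described above can be reproduced with a short sketch. The function name `encode_pdu` and the dictionary `PREAMBLES` are introduced here for illustration only. Just the two preamble codes quoted in the text are included, and the packing order (preamble first, then the three characters, most significant bit first) is inferred from the two hex examples.

```python
# Three-bit preamble codes quoted in the text; other preambles are omitted.
PREAMBLES = {"TO": 0b010, "TIS": 0b101}


def encode_pdu(preamble: str, address: str) -> str:
    """Pack a 3-bit preamble and three 7-bit ASCII characters into 24 bits."""
    if len(address) != 3:
        raise ValueError("a 2G ALE PDU carries exactly three characters")
    word = PREAMBLES[preamble]
    for char in address:
        code = ord(char)
        if code > 0x7F:
            raise ValueError("only 7-bit ASCII characters fit in the PDU")
        word = (word << 7) | code  # append the next 7-bit character
    return f"{word:06x}"


if __name__ == "__main__":
    print(encode_pdu("TO", "SAM"))   # 54e0cd
    print(encode_pdu("TIS", "JOE"))  # b2a7c5
```

Both outputs agree with the hex encodings given in the example call.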

An  obvious  requirement  for  two  radios  to  be  able  to  communicate  is  that  the  sender  transmits 
on the same frequency as the one on which the receiver is listening. To adapt to varying
transmission  conditions,  ALE  radio  networks  must  use  several  different  frequencies.  This 
is  achieved  by  having  all  idle  radios  in  a  network  scan  a  predefined  list  of  frequencies  by 
sequentially  tuning  to  them  for  a  short  period  of  time,  called  the  dwell  time,  and  listening 
for  ALE  PDUs.  A  radio  that  needs  to  establish  a  link  with  another  radio  selects  a  suitable 
frequency  using  the  ionospheric  transmission  model  and  starts  transmitting  PDUs  on  that 
frequency.  Because  the  radios  scan  frequencies  asynchronously,  some  number  of  tries  will 
be  needed  for  the  intended  receiver  to  register  the  transmission.  When  a  radio  detects  a 
PDU  intended  for  it,  it  stops  scanning  and  transmits  a  reply. 

The  asynchronous  scanning  is  a  source  of  some  inefficiency.  The  next  generation  of 
the  standard,  3G  ALE,  solved  this  problem  utilizing  synchronous  scanning,  i.e.,  all  radio 
stations in the network tune to the same frequency at the same time. Since a transmitting
radio knows which frequency an intended receiver is tuned to at any instant, only a single
transmission will normally be required. A requirement for this to work is for all stations’
internal clocks to be synchronized to within less than the dwell time. This can
be  done  by  manual  input,  with  the  help  of  external  timing  input  from  a  global  navigation 
satellite  system  receiver,  or  through  asynchronous  over-the-air  synchronization  with  another 
ALE  station. 

3G  ALE  uses  26-  and  48-bit  PDUs  that  have  different  formats  from  2G  ALE.  The  addressing 
format  is  different  as  well,  with  3G  ALE  using  binary  addresses.  Additionally,  3G  ALE 
PDUs  contain  cyclic  redundancy  check  (CRC)  checksums,  allowing  for  error  detection. 

To prevent unauthorized users from linking with radios in an ALE radio network, or
from recovering information from intercepted PDUs, the standards specify an optional linking
protection  scheme  that  allows  for  encryption  of  transmitted  PDUs.  ALE  linking  protection 
has  five  application  levels  (AL):  AL-0  through  AL-4.  Their  definitions  from  [4]  are  shown 
in  Table  2.1. 

Table 2.1. ALE linking protection application levels. Adapted from [4].

Application level   Definition
AL-0                unprotected application level
AL-1                unclassified application level
AL-2                unclassified enhanced application level
AL-3                unclassified but sensitive application level
AL-4                classified application level

The  first  application  level,  AL-0,  corresponds  to  all  encryption  being  turned  off.  Application 
levels  AL-1  and  AL-2  use  the  SoDark  cipher  algorithms  and  are  described  as  “for  general 
U.S.  Government  and  commercial  use.”  The  difference  between  AL-1  and  AL-2  is  that 
the  latter  uses  a  shorter  protection  interval  (PI):  two  seconds  instead  of  60  seconds.  The 
tweak  (see  Section  2.2  and  Chapter  3)  used  for  encryption  of  PDUs  remains  the  same  for 
the  duration  of  a  PI.  This  makes  AL-1  somewhat  vulnerable  to  replay  attacks. 

AL-3 and AL-4 use hardware cipher modules developed and approved by the National
Security Agency (NSA). AL-4 is the only AL intended for the protection of classified
information.  These  application  levels  are  outside  the  scope  of  this  thesis. 

The tweak, which is referred to as the seed in the standards, is a 64-bit value used to prevent
replay attacks. It contains the transmission frequency, PI number (i.e., transmission
time), date, and the word number (i.e., the order of the PDU in the current transmission).
The advantage of using that data is that it is implicitly known by the receiver and does not
need to be transferred along with the ciphertext. Table 2.2 shows the tweak data structure,
and Chapter 3 describes how the tweak is used by the linking protection cipher in the ALE
protocols.


Table 2.2. Construction of tweak used in ALE linking protection. Bit number
1 is the most significant bit and 64 the least significant. Adapted from [4].

Field             Bits
Month             1-4
Day               5-9
PI                10-26
Word number       27-34
Zero pad          35-36
Frequency (BCD)   37-64
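
The field layout in Table 2.2 can be expressed as a bit-packing routine. The sketch below only reproduces the field widths and ordering from the table. How the standards number months, days, and PIs, and which units the seven BCD frequency digits represent, are not stated here, so the argument values in the usage example are purely hypothetical. The names `pack_tweak`, `unpack_tweak`, and `FIELDS` are illustrative.

```python
# (name, width in bits), listed from bit 1 (most significant) to bit 64.
FIELDS = [("month", 4), ("day", 5), ("pi", 17),
          ("word", 8), ("zero_pad", 2), ("freq_bcd", 28)]


def to_bcd(digits: str) -> int:
    """Encode a string of decimal digits as packed BCD, 4 bits per digit."""
    value = 0
    for digit in digits:
        value = (value << 4) | int(digit)
    return value


def pack_tweak(month: int, day: int, pi: int, word: int, freq_digits: str) -> int:
    """Assemble the 64-bit tweak from its fields, most significant field first."""
    if len(freq_digits) != 7:
        raise ValueError("28 bits of BCD hold exactly seven digits")
    values = {"month": month, "day": day, "pi": pi, "word": word,
              "zero_pad": 0, "freq_bcd": to_bcd(freq_digits)}
    tweak = 0
    for name, width in FIELDS:
        if not 0 <= values[name] < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
        tweak = (tweak << width) | values[name]
    return tweak


def unpack_tweak(tweak: int) -> dict:
    """Split a 64-bit tweak back into its named fields."""
    fields, shift = {}, 64
    for name, width in FIELDS:
        shift -= width
        fields[name] = (tweak >> shift) & ((1 << width) - 1)
    return fields


if __name__ == "__main__":
    tweak = pack_tweak(month=9, day=15, pi=720, word=1, freq_digits="0712345")
    print(f"{tweak:016x}", unpack_tweak(tweak))
```

The field widths sum to 4 + 5 + 17 + 8 + 2 + 28 = 64 bits, matching the tweak size.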

2.4  Cryptanalysis 

Cryptanalysis is, as the name implies, the analysis of cryptosystems. In particular, cryptanalysis
normally aims to establish the bounds of a cryptosystem’s security. It is practiced
by both users of cryptosystems and their adversaries. Cryptosystem users perform cryptanalysis
to ensure there are no ways to recover information about plaintexts, ciphertexts, or
keys. Their adversaries do cryptanalysis in the hope of finding such ways [7].

In  general,  any  ability  to  recover  information  that  requires  less  effort  than  trying,  on  average, 
half  of  the  possible  keys  is  considered  a  break  for  cryptanalytic  purposes.  For  example, 
there  exists  a  key  recovery  attack  for  AES  with  computational  complexity  proportional  to 
$2^{126.1}$, while the average computational complexity of a brute force attack is $2^{127}$. AES is
therefore  broken  in  theory.  Since  the  complexity  of  this  attack  is  still  astronomical,  however, 
and  requires  a  very  large  amount  of  data,  the  cipher  is  not  broken  in  practice  and  is  still 
considered  safe  to  use  [14]. 

The  starting  point  for  cryptanalysis  on  a  particular  cipher  is  usually  to  study  its  mathematical 
description  and  to  apply  cryptanalytic  techniques  that  have  been  successful  with  other  similar 
ciphers.  A  common  approach  is  to  start  by  analyzing  versions  of  the  cipher  with  reduced 
security. This can, for example, be a version that has a reduced number of rounds or an
improbable attack setting, such as having the entire codebook for a given key. The insights from
these  attacks  may  then  provide  tools  and  methods  for  attacking  the  cipher  with  more  rounds 
or  in  more  generalized  settings  [3]. 

Perhaps  the  single  most  important  property  for  a  cipher  to  have  if  it  is  to  be  resistant  to 
cryptanalysis  is  nonlinearity.  This  property  follows  directly  from  Shannon’s  diffusion  and 
confusion  properties.  A  completely  linear  cipher  can  simply  be  described  as  a  system  of 
linear  equations  that  can  be  solved  by  Gaussian  elimination  given  a  very  small  number 
of  plaintexts.  Since  systems  of  linear  equations  can  be  solved  in  polynomial  time,  this  is 
expected  to  be  faster  than  an  exhaustive  search  of  the  key  space,  even  for  very  small  key 
spaces. 
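
The weakness of a completely linear cipher can be demonstrated with a minimal sketch. The toy cipher below is hypothetical: it computes $C = A P \oplus B K$ over GF(2), where $A$ and $B$ are public, randomly chosen invertible 16 x 16 bit matrices. Neither the construction nor the names `linear_encrypt`, `solve_gf2`, and `mat_vec` come from the ALE standards. A single known plaintext then yields the linear system $B K = C \oplus A P$, which Gaussian elimination solves for the key.

```python
import random

N = 16  # toy block and key size in bits (assumption)


def parity(x: int) -> int:
    return bin(x).count("1") & 1


def mat_vec(rows, x: int) -> int:
    """Multiply a GF(2) matrix, stored as row bitmasks, by the bit vector x."""
    return sum(parity(row & x) << i for i, row in enumerate(rows))


def solve_gf2(rows, rhs: int):
    """Return the unique x with rows * x = rhs over GF(2), or None if singular."""
    aug = [(row, (rhs >> i) & 1) for i, row in enumerate(rows)]
    for col in range(N):
        pivot = next((i for i in range(col, N) if (aug[i][0] >> col) & 1), None)
        if pivot is None:
            return None  # matrix is rank deficient
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_row, pivot_bit = aug[col]
        for i in range(N):
            if i != col and (aug[i][0] >> col) & 1:
                aug[i] = (aug[i][0] ^ pivot_row, aug[i][1] ^ pivot_bit)
    # The matrix part is now the identity, so the right-hand sides are the solution.
    return sum(bit << col for col, (_, bit) in enumerate(aug))


def random_invertible(rng):
    """Draw random bit matrices until one is invertible."""
    while True:
        rows = [rng.getrandbits(N) for _ in range(N)]
        if solve_gf2(rows, 0) is not None:
            return rows


rng = random.Random(1)
A = random_invertible(rng)
B = random_invertible(rng)


def linear_encrypt(key: int, block: int) -> int:
    """Completely linear toy cipher: C = A*P xor B*K over GF(2)."""
    return mat_vec(A, block) ^ mat_vec(B, key)


secret_key, plaintext = 0xBEEF, 0x1234
ciphertext = linear_encrypt(secret_key, plaintext)
recovered = solve_gf2(B, ciphertext ^ mat_vec(A, plaintext))

if __name__ == "__main__":
    print(f"recovered key: {recovered:04x}")
```

The elimination needs on the order of $N^3$ bit operations, compared with $2^{N-1}$ trial encryptions on average for an exhaustive search.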

A nonlinear cipher, on the other hand, must be described as a system of equations of higher
order. Such a system of equations is reducible to the multivariate quadratic (MQ) problem,
which is non-deterministic polynomial-time (NP)-hard. Thus, it has complexity $O(2^{an})$,
$0 < a < 1$, in the case of $n$ binary variables. In most cases, this makes it harder to attack a
cipher this way than the brute force approach of testing all the keys, which has complexity
equivalent to $2^{k-1}$ encryptions on average, where $k$ is the key length in bits.

There  are  exceptions:  In  some  cases,  it  is  possible  to  linearize  the  system  of  equations,  i.e., 
to  replace  all  nonlinear  terms  with  new  variables  and  then  solve  the  resulting  system  of 
linear  equations.  This  will  yield  a  number  of  spurious  solutions  that  must  be  filtered  out. 
Linearization of an MQ system of equations is only possible if it is sufficiently sparse and
overdefined.  Another  exception  is  the  use  of  SAT  solvers  or  constraint  solvers  to  solve  the 
system  of  equations.  SAT  solvers  are  able  to  solve  large  systems  of  Boolean  equations  with 
comparatively  high  speed  [15]. 

While  many  attacks  are  specific  to  certain  ciphers,  there  are  a  number  of  attacks  that  work 
on  large  classes  of  block  ciphers.  Some  examples  of  such  generic  attacks  are  given  in  the 
following  sections. 



2.4.1  Brute  Force  Attacks 

A  brute  force  attack  works  by  trying  every  possible  key  until  the  right  one  is  found.  On 
average,  half  the  key  space  needs  to  be  searched  before  the  correct  key  is  found  so,  for  a 
$k$-bit key, the effort is proportional to $2^{k-1}$. The only protection against brute force key
search  is  to  ensure  that  the  key  space  is  large  enough  for  the  attack  to  be  intractable,  at  least 
during  the  expected  period  for  which  the  encrypted  data  needs  to  be  protected.  In  general, 
a  cipher  is  considered  secure  if  no  attacks  exist  that  are  faster  than  an  exhaustive  search 
in  practice  and  the  size  of  the  key  space  makes  a  brute  force  search  impossible.  Various 
recommendations  for  minimum  key  lengths  exist.  Table  2.3  compiles  the  recommendations 
from  [16]  and  [17].  Among  the  sources  consulted,  there  is  consensus  that  a  128-bit  key  size 
provides  good  security,  64  bits  or  less  provides  no  security  in  practice,  and  80  bits  is  the 
smallest  key  size  that  provides  any  measure  of  security. 
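
An exhaustive key search can be sketched against a deliberately weak target. The cipher `toy_encrypt` below is a hypothetical 16-bit Feistel network with a 16-bit key, chosen only so that the whole key space can be searched in well under a minute. It is unrelated to SoDark. Because the block is as short as the key, a single known plaintext-ciphertext pair may leave false candidates, so three pairs are used here.

```python
import hashlib

KEY_BITS = 16  # toy key size (assumption, far below any recommended length)


def toy_encrypt(key: int, block: int, rounds: int = 4) -> int:
    """Hypothetical 16-bit Feistel cipher used as the attack target."""
    left, right = block >> 8, block & 0xFF
    for rnd in range(rounds):
        data = key.to_bytes(2, "big") + bytes([rnd, right])
        left, right = right, left ^ hashlib.sha256(data).digest()[0]
    return (left << 8) | right


def brute_force(pairs):
    """Return every key consistent with all known plaintext-ciphertext pairs."""
    return [key for key in range(1 << KEY_BITS)
            if all(toy_encrypt(key, p) == c for p, c in pairs)]


secret = 0x2A5C
known_pairs = [(p, toy_encrypt(secret, p)) for p in (0x0000, 0x1234, 0xFFFF)]
candidates = brute_force(known_pairs)

if __name__ == "__main__":
    print([f"{key:04x}" for key in candidates])
```

The search visits all $2^{16}$ keys. Stopping at the first consistent key would, on average, require testing half of them, in line with the $2^{k-1}$ estimate above.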


Table 2.3. Key length recommendations. Adapted from [16], [17].

                  Level of security
Key size (bits)   Knudsen & Robshaw (2010)   ECRYPT II (2012)
32                -                          attacks in real-time by individuals
40                easy to break              very short-term protection
64                practical to break         -
80                not currently feasible     smallest general-purpose level
96                -                          legacy standard level
112               -                          medium-term protection
128               very strong                long-term protection
256               exceptionally strong       foreseeable future

An efficient brute force attack requires an efficient implementation of the cipher function.
Application-specific integrated circuits (ASIC) built specifically for breaking the cipher in
question are the fastest, but most expensive, technology. Constructing an ASIC to perform
brute force key search requires custom integrated circuit design and manufacturing, which is
expensive and out of reach for individuals and small organizations. In 1998, the Electronic
Frontier Foundation (EFF) built an ASIC-based computer, Deep Crack, that could break
the 56-bit Data Encryption Standard (DES) cipher in less than a week. The budget for
the project was about 200,000 U.S. dollars [18]. This is an example of a medium-size
organization’s ability to break 56-bit ciphers in the late 1990s.

A slower and cheaper, but still quite efficient, way to perform a brute force search is
to  employ  field-programmable  gate  arrays  (FPGA).  FPGAs  are  reconfigurable  hardware 
gate  networks  that  enable  efficient  implementations  and  parallelization  of  calculations  at 
comparatively  low  cost.  Cloud  FPGA  computing  services  as  well  as  FPGA  expansion  cards 
for  personal  computers  are  available.  This  could  enable  the  use  of  FPGAs  for  brute  force 
key  searches  by  individuals  and  organizations  of  any  size. 

Graphics processing units (GPU) are primarily designed for real-time rendering of graphics
on personal computers. Yet, their design also makes them useful for highly parallel
computation: a single modern GPU can contain thousands of processor cores. This has
led to the emergence of general-purpose computing on graphics processing units (GPGPU)
programming frameworks, such as OpenCL and CUDA, specifically tailored for GPU computing.
These frameworks are used to write programs that solve various hard problems
encountered in a wide range of fields.

Lastly, brute force key search can be done with central processing units (CPU) in general-purpose
computers. Except for ciphers that have been specifically engineered to resist the
aforementioned methods, this tends to be the slowest method. Its advantages, however,
are shorter development time and the possibility of using existing software implementations
of the cipher. In addition, an organization can use the computer infrastructure it already has
in  place  to  perform  the  key  search.  There  are  also  examples  of  the  Internet  being  used  to 
leverage  the  power  of  computers  all  over  the  world  to  perform  brute  force  key  search. 

Regardless  of  the  hardware  used,  the  fastest  implementations  of  ciphers  are  in  forms  that 
regard  the  cipher  as  a  network  of  logic  gates  rather  than  as  an  imperative  computer  program. 
In  ASICs  and  FPGAs,  this  enables  a  design  that,  in  effect,  tests  one  key  per  clock  cycle.  In 
GPUs  and  CPUs,  this  enables  bitslicing  implementations.  In  a  bitslicing  implementation, 
each  variable  in  the  program  represents  one  bit  of  state  and  the  entire  cipher  is  implemented 
in  software  as  bitwise  logic  operations.  This  enables  instruction  level  parallelism,  where 
every  instruction  operates  on  a  number  of  parallel  encryptions  or  decryptions.  The  exact 
number is dependent on the platform's register size. On modern processors whose
single instruction, multiple data (SIMD) instruction sets have registers as wide as 256 or
512 bits, that many encryptions or decryptions can be performed in parallel
on a single processor core. Additionally, bit-level permutations are performed at no cost
at all in bitslicing implementations [19].
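
The bitslicing idea can be illustrated with a minimal sketch. The three-bit gate network below is a made-up toy, not part of any cipher discussed here, and the 64-lane width simply mimics a 64-bit register. Each Python integer holds one bit position for all lanes, so a single XOR or AND acts on 64 evaluations at once, and the final bit permutation is only a renaming of variables.

```python
import random

LANES = 64                  # one independent evaluation per bit of a 64-bit word
MASK = (1 << LANES) - 1     # all-ones word, used to implement NOT


def gate_network(x0, x1, x2, ones):
    """Toy 3-bit nonlinear map expressed only with XOR, AND, and NOT.

    Each argument holds one bit position for every lane.
    """
    y0 = x0 ^ (x1 & x2)
    y1 = x1 ^ x2
    y2 = x2 ^ ones          # NOT x2
    # A bit permutation costs nothing: the outputs are simply renamed.
    return y2, y0, y1


def to_slices(values, width):
    """Transpose per-lane integers into `width` words, one word per bit."""
    return [sum(((v >> bit) & 1) << lane for lane, v in enumerate(values))
            for bit in range(width)]


def from_slices(slices, lanes):
    """Inverse of to_slices."""
    return [sum(((word >> lane) & 1) << bit for bit, word in enumerate(slices))
            for lane in range(lanes)]


def evaluate_bitsliced(values):
    """Apply the gate network to all lanes at once."""
    x0, x1, x2 = to_slices(values, 3)
    return from_slices(gate_network(x0, x1, x2, MASK), len(values))


def evaluate_scalar(value):
    """Reference: the same gates applied to a single 3-bit input."""
    bits = [(value >> i) & 1 for i in range(3)]
    y = gate_network(bits[0], bits[1], bits[2], 1)
    return y[0] | (y[1] << 1) | (y[2] << 2)


inputs = [random.randrange(8) for _ in range(LANES)]
outputs = evaluate_bitsliced(inputs)
```

In a real implementation the transposition is done once for a whole batch of keys, and the gate network is the cipher itself.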

Finding  the  most  efficient  logic  gate  representation  of  nonlinear  parts  of  the  cipher,  such  as 
S-boxes,  is  an  NP-hard  problem.  Without  a  clear  mathematical  description  of  an  S-box,  a 
partial  search  of  the  solution  space  using  a  heuristic  algorithm  may  be  the  only  way  to  find 
an  efficient,  but  non-optimal,  solution. 

2.4.2  Time-Memory  Trade-Off  Attacks 

Time-memory  trade-off  (TMTO)  attacks  exist  for  all  block  ciphers.  The  simplest  example 
is  to  construct  a  dictionary  that  associates  any  given  plaintext-ciphertext  pair  with  a  key. 
For  most  ciphers  that  are  in  practical  use,  the  storage  space  required  to  mount  such  an  attack 
makes this impossible. For a cipher with a block and key size of $n$ bits, the storage space
required for the lookup table would be $n \cdot 2^{2n}$ bits. The required space can be
reduced to $n \cdot 2^{n}$ bits if the dictionary is restricted to a single plaintext.
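
To give a feeling for the magnitudes, the two storage formulas can be evaluated directly. The helper name and the choice of $n = 56$ are illustrative assumptions, not values taken from a particular attack.

```python
def dictionary_bits(n, single_plaintext=False):
    """Storage in bits for a key-lookup dictionary with n-bit blocks and keys."""
    entries = 2 ** n if single_plaintext else 2 ** (2 * n)
    return n * entries


# Example: n = 56 with the dictionary restricted to one plaintext.
bytes_needed = dictionary_bits(56, single_plaintext=True) // 8
pebibytes = bytes_needed // 2 ** 50
```

Even the restricted dictionary for $n = 56$ needs 448 PiB, which is why the trade-off attacks described next are of practical interest.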

Hellman [20] describes a TMTO attack that allows the attacker to choose an almost arbitrary
point on a trade-off curve between the extremes provided by the brute force and dictionary
attacks. A TMTO attack starts by creating reusable tables for a certain plaintext by
performing precomputations of a complexity equivalent to the brute force recovery of a single
key. Given a ciphertext corresponding to that plaintext, the key can be recovered by quickly
regenerating only parts of the precomputations with the help of the tables. This way, the
key can be recovered significantly faster than by brute force alone. TMTO attacks have
been used to perform practical breaks of ciphers that are in current use. One of the more
notable examples of a cipher broken in this way is the A5/1 cipher used in the GSM standard
for mobile telephony [21]. For details on the attack, the reader is referred to Hellman's
original paper [20] or to the description in [16].
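
The chain idea behind such tables can be sketched on a toy scale. Everything below is an illustrative assumption: the 16-bit function `f` stands in for the map from a key to the (reduced) ciphertext of the fixed plaintext, and a single table is used where Hellman's method uses several tables with different reduction functions. Only chain start and end points are stored; a lookup walks forward from the target until it hits a stored end point and then regenerates just that chain.

```python
import hashlib


def f(x):
    """Toy one-way step on 16-bit values (stand-in for key -> E_key(P))."""
    digest = hashlib.sha256(x.to_bytes(2, "big")).digest()
    return int.from_bytes(digest[:2], "big")


def build_table(starts, chain_len):
    """Precompute chains; keep only a map from end point to start points."""
    table = {}
    for start in starts:
        x = start
        for _ in range(chain_len):
            x = f(x)
        table.setdefault(x, []).append(start)
    return table


def lookup(table, chain_len, target):
    """Return some x with f(x) == target, or None if no chain covers it."""
    y = target
    for _ in range(chain_len):
        for start in table.get(y, []):
            x = start
            for _ in range(chain_len):      # regenerate this chain only
                if f(x) == target:
                    return x
                x = f(x)
        y = f(y)
    return None


CHAIN_LEN = 64
table = build_table(range(0, 1 << 16, 64), CHAIN_LEN)

# Pick a "secret" that is known to lie on one of the chains.
secret = 640
for _ in range(10):
    secret = f(secret)
recovered = lookup(table, CHAIN_LEN, f(secret))
```

The lookup may return a different preimage than the original secret when chains merge, which corresponds to the false alarms that a real attack has to filter with additional known data.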

2.4.3  Meet-in-the-Middle  Attacks 

Meet-in-the-middle  (MITM)  attacks  are  an  example  of  a  type  of  structural  attack.  They 
exploit  the  fact  that  some  ciphers  can  be  divided  into  two  parts,  where  neither  part  is 
dependent  on  the  full  key.  This  attack  type  was  first  described  in  [22],  where  possible 
improvements to the DES algorithm are investigated. When considering double encryption
with DES using two different keys, the authors show that such a system can be broken
with effort proportional to $2^{57}$, despite the system having a 112-bit key. Thus, the
double encryption with two independent keys adds only a single bit of security. As an
example, Algorithm 2.1 performs a MITM attack on a product cipher $f = h \circ g$ using
two known plaintext-ciphertext pairs.


Algorithm 2.1 Perform a meet-in-the-middle attack on a product cipher $f = h \circ g$.
Adapted from [22].

procedure MeetInTheMiddle($P_1$, $C_1$, $P_2$, $C_2$)
    $L \leftarrow$ empty list
    for all $k_1$ do
        $v \leftarrow g_{k_1}(P_1)$
        $L$.append($v$, $k_1$)
    end for
    for all $k_2$ do
        $w \leftarrow h_{k_2}^{-1}(C_1)$
        $k_1 \leftarrow L[w]$
        if $h_{k_2}(g_{k_1}(P_2)) = C_2$ then
            Print($k_1$, $k_2$)
        end if
    end for
end procedure
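
A runnable sketch of Algorithm 2.1 on a toy scale follows. The two 16-bit component ciphers with 8-bit keys are invented for this example (their constants have no significance), and the list $L$ is realized as a dictionary from intermediate value to all first-stage keys that produce it, since several keys may collide on the same value.

```python
from collections import defaultdict

WORD = 0xFFFF


def rotl(x, r):
    return ((x << r) | (x >> (16 - r))) & WORD


def rotr(x, r):
    return ((x >> r) | (x << (16 - r))) & WORD


def make_cipher(mult, rot):
    """Build a toy 16-bit block cipher with an 8-bit key (illustrative only)."""
    inv = pow(mult, -1, 1 << 16)            # mult must be odd

    def enc(k, x):
        return rotl(((x + 257 * k) * mult) & WORD, rot) ^ k

    def dec(k, y):
        return (rotr(y ^ k, rot) * inv - 257 * k) & WORD

    return enc, dec


g, g_inv = make_cipher(40503, 5)
h, h_inv = make_cipher(25735, 11)


def meet_in_the_middle(p1, c1, p2, c2):
    """Return all (k1, k2) consistent with both plaintext-ciphertext pairs."""
    middle = defaultdict(list)
    for k1 in range(256):                   # forward half: 2^8 encryptions
        middle[g(k1, p1)].append(k1)
    found = []
    for k2 in range(256):                   # backward half: 2^8 decryptions
        w = h_inv(k2, c1)
        for k1 in middle.get(w, []):
            if h(k2, g(k1, p2)) == c2:      # second pair filters false matches
                found.append((k1, k2))
    return found


K1, K2 = 0x3A, 0xC5
P1, P2 = 0x1234, 0xBEEF
candidates = meet_in_the_middle(P1, h(K2, g(K1, P1)), P2, h(K2, g(K1, P2)))
```

The attack performs about $2 \cdot 2^{8}$ cipher operations instead of the $2^{16}$ needed to try every key pair, at the cost of storing $2^{8}$ intermediate values.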


In the case of Double DES (2DES), we expect to find the key in about $2^{57}$ DES operations
using $2^{56}$ 56-bit blocks of memory. For that reason, DES was eventually strengthened
through Triple DES (3DES), which is still vulnerable to the same attack, but with an attack
complexity of about $2^{112}$ DES operations. This was considered sufficiently prohibitive at
the time.


2.4.4  Differential  Cryptanalysis 

Differential  cryptanalysis  is,  together  with  linear  cryptanalysis,  one  of  the  strongest  known 
general  attacks  on  block  ciphers.  It  was  first  described  in  the  open  literature  by  Biham 
and  Shamir  [23].  Attacks  based  on  differential  cryptanalysis  work  with  differences,  called 
differentials,  between  inputs  and  outputs  of  parts  of  a  cipher.  Commonly,  the  differentials 
are  defined  as  the  bitwise  XOR  of  two  values,  although  other  definitions  such  as  modular 
addition can be used.

The basic idea of differential attacks is to distinguish the output of a certain function from
random by considering the probability that a certain output differential is generated by a
certain input differential or vice versa. Since S-boxes are the only source of nonlinearity
in many ciphers, the study of their differential properties is usually an important part of
cryptanalysis. An example 4x4 S-box from [16] is shown in Table 2.4, and Table 2.5 shows
the parts of its difference distribution table (DDT) that correspond to inputs that are bitwise
complements. It is clear that the differentials shown in the rightmost column are not evenly
distributed. The value d, for example, appears with probability $\frac{10}{16}$, and 12 of the
16 possible values have probability 0.


Table  2.4.  An  example  4x4  S-box  vulnerable  to  differential  cryptanalysis. 
Adapted from [16].

Table 2.5. Excerpt from the difference distribution table of the example
S-box from Table 2.4. Adapted from [16].

The DDTs of the S-boxes in a cipher, together with knowledge of its round structure,
can be used to construct relations between inputs and outputs of a number of consecutive
rounds that have probabilities much higher or lower than expected for a cipher adhering to
Shannon's diffusion property.
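
Computing a DDT is mechanical, as the following sketch shows. The 4-bit S-box used here is an assumption made for illustration and may differ from the one in Table 2.4. Row `0xF` of the result corresponds to input pairs that are bitwise complements.

```python
# Assumed example S-box (illustrative; not necessarily the one in Table 2.4).
SBOX = [0x6, 0x4, 0xC, 0x5, 0x0, 0x7, 0x2, 0xE,
        0x1, 0xF, 0x3, 0xD, 0x8, 0xA, 0x9, 0xB]


def difference_distribution_table(sbox):
    """ddt[dx][dy] counts inputs x with S(x) XOR S(x XOR dx) == dy."""
    size = len(sbox)
    ddt = [[0] * size for _ in range(size)]
    for x in range(size):
        for dx in range(size):
            ddt[dx][sbox[x] ^ sbox[x ^ dx]] += 1
    return ddt


ddt = difference_distribution_table(SBOX)
row_f = ddt[0xF]                                    # complementary inputs
delta_uniformity = max(max(row) for row in ddt[1:])  # ignore the zero row
```

For this particular S-box, the output difference `0xD` occurs for 10 of the 16 complementary input pairs and 12 output differences never occur, which is the kind of uneven distribution a differential attack exploits.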

2.4.5  Linear  Cryptanalysis 

Linear  cryptanalysis  was  first  described  by  Matsui  in  his  cryptanalysis  of  the  DES  cipher 
[24].  As  with  differential  cryptanalysis,  it  provides  a  method  for  discovering  and  using 
non-random  statistical  properties  of  the  cipher.  This  time,  the  property  used  is  the  parity  of 
certain  bit  positions  in  the  input  and  output. 

Again, an example 4x4 S-box from [16], shown in Table 2.6, illustrates the concept. The top
two rows in the table show the S-box, while the additional two bottom rows show the parity
of certain bits of its input and output, respectively, selected by the masks
$\alpha = (1, 0, 0, 1)$ and $\beta = (0, 0, 1, 0)$. The two bottom rows differ in all columns,
except for $x = 1$ and $x = \mathrm{f}$. This means that the relation
$(\alpha \cdot x) \oplus 1 = \beta \cdot S(x)$ holds with probability $\frac{14}{16}$, which is
a significant difference from the $\frac{1}{2}$ probability expected from an S-box with good
nonlinearity.


Table  2.6.  An  example  4x4  S-box  vulnerable  to  linear  cryptanalysis.  Adapted 
from [16].

The equivalent to a DDT in linear cryptanalysis is the linear approximation table (LAT),
which shows the deviation from the expected probability of $\frac{1}{2}$ for all pairs of
input and output masks. As with differential cryptanalysis, the LAT in linear cryptanalysis
is used to create relations between the parity of inputs and outputs of several consecutive
cipher rounds. The relations can then be used in attacking the cipher.
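
A LAT can be computed by brute force for small S-boxes. The sketch below again uses an assumed 4-bit S-box chosen only for illustration (not necessarily the one in Table 2.6) and stores each entry as the number of matching inputs minus half the table size, so that an entry divided by 16 is the deviation from $\frac{1}{2}$.

```python
# Assumed example S-box (illustrative; not necessarily the one in Table 2.6).
SBOX = [0x6, 0x4, 0xC, 0x5, 0x0, 0x7, 0x2, 0xE,
        0x1, 0xF, 0x3, 0xD, 0x8, 0xA, 0x9, 0xB]


def parity(x):
    """Parity (XOR of all bits) of a non-negative integer."""
    return bin(x).count("1") & 1


def linear_approximation_table(sbox):
    """lat[a][b] = #{x : a.x == b.S(x)} - len(sbox) / 2."""
    size = len(sbox)
    lat = [[0] * size for _ in range(size)]
    for a in range(size):
        for b in range(size):
            matches = sum(parity(a & x) == parity(b & sbox[x])
                          for x in range(size))
            lat[a][b] = matches - size // 2
    return lat


lat = linear_approximation_table(SBOX)
best = max(abs(lat[a][b]) for a in range(16) for b in range(16) if a or b)
```

A large absolute value in `lat[a][b]` marks a mask pair whose parity relation holds, or fails, noticeably more often than half the time.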

For  a  more  in-depth  description  of  the  methods  of  linear  and  differential  cryptanalysis,  the 
reader  is  referred  to  the  excellent  tutorials  in  [16]  and  [25]. 




CHAPTER 3:
The SoDark Family of Algorithms


3.1  Background 

The  Lattice  cipher  algorithm  is  specified  in  [4].  It  is  a  24-bit  block  cipher  that  uses  a  56-bit 
key  and  a  64-bit  tweak.  It  has  eight  rounds  and  is  used  to  encrypt  24-bit  PDUs  sent  by  the 
2G ALE protocol. A version called SoDark-3 is used in the 3G ALE standard to encrypt
24 bits of the 26-bit PDUs. It is identical to the original Lattice algorithm, except that it
uses  16  rounds.  Since  3G  ALE  also  uses  48-bit  PDUs,  SoDark-3  has  been  extended  into  a 
version  with  48-bit  block  size  called  SoDark-6. 

The  cipher  was  developed  specifically  for  the  ALE  application.  The  main  purpose  of  the 
algorithm,  according  to  [5],  is  to  prevent  unauthorized  linking  with  radios  that  are  part  of 
an  ALE  radio  network.  The  reference  specifically  mentions  both  replay  attacks,  where  a 
previously  sent  legitimate  PDU  is  replayed  by  an  adversary,  as  well  as  attacks  where  the 
adversary  is  actively  trying  to  recover  the  key. 

Further  insight  is  given  by  [1],  which  lists  the  following  seven  design  requirements  for  the 
original  Lattice  algorithm: 

(a)  transparency  to  ALE  protocols; 

(b)  self-synchronization; 

(c)  minimum  impact  on  scanning  dwell  time; 

(d)  24-bit  block  operation; 

(e) channel- and time-varying;

(f)  moderate  computational  requirements;  and 

(g)  unclassified  algorithm. 

Requirements  a,  b,  c,  and  d  all  have  the  same  root  cause  in  that  the  2G  ALE  standard  uses 
24-bit  PDUs  and  non-synchronous  frequency  scanning.  A  station  in  an  HF  radio  network 
that  uses  ALE  must  be  able  to  switch  to  a  frequency  and  immediately  start  receiving  PDUs. 
Since the dwell time, i.e., the time the station listens to any given frequency, is quite
short, any received non-authentic PDU must not cause an interruption in scanning. The
linking  protection  cipher  is  also  an  optional  feature  in  the  standard  and  must  be  a  drop-in 
replacement  in  the  sense  that  no  more  data  than  the  24  or  48  bits  allocated  in  the  transmission 
format  can  be  used  when  linking  protection  is  enabled. 

Requirement  e  is  needed  if  the  cipher  is  to  be  semantically  secure.  Without  this,  it  would  be 
trivially  vulnerable  to  traffic  analysis  and  replay  attacks.  In  particular,  the  short  block  size 
would  enable  an  attacker  to  quickly  compile  relevant  parts  of  the  codebook  for  a  given  key. 

The  last  two  requirements,  f  and  g,  stem  from  the  fact  that  the  ALE  algorithm  and  cipher 
are  meant  to  be  used  by  field  radios. 

The  round  function  consists  of  S-box  lookups  and  XOR  operations,  which  makes  the  S- 
box  the  only  nonlinear  component  of  the  cipher.  Table  3.1  shows  the  S-box  lookup  table. 
Neither  [1]  nor  [4]  nor  [5]  describes  how  the  S-box  was  generated  or  the  criteria  for  its 
selection. 


Table 3.1. The Lattice and SoDark S-box. Adapted from [4].

3.2 Notation

This  section  introduces  the  notation  used  in  the  descriptions  and  cryptanalysis  of  the  SoDark 
family  of  ciphers  in  this  and  following  chapters. 

The bitwise exclusive or (XOR) operation is denoted by $\oplus$.

S-box lookups are denoted by $s$ and inverse S-box lookups by $s^{-1}$.

Concatenation of variables is denoted by $\|$.

The full plaintext is denoted by $P$, the full ciphertext by $C$, the full key by
$\mathcal{K}$, and the full tweak by $T$.

The ciphers described are byte oriented. Input and output bytes to and from each round
are denoted by the letters A through F, with the letter A representing the most significant
byte of the state and the other letters representing the following bytes in falling order. To
differentiate between state in different rounds, a superscript in parentheses is used, where
$A^{(r-1)}$ represents the input to and $A^{(r)}$ the output from the $r$th round.

In the cryptanalysis, the states of several parallel encryptions are studied and subscripts
are used to differentiate the parallel variables. For example, $A_1^{(0)}$ and $A_2^{(0)}$
represent the most significant input byte in two parallel encryptions.

Differentials, i.e., XOR differences between the same state variable in two parallel
encryptions, are denoted with the $\Delta$ character. Continuing the previous example,
$\Delta A^{(0)}$ would be the differential of the most significant plaintext byte.

In  some  cases  it  will  be  convenient  to  study  partial  decryptions  of  the  parts  of  a  round  that 
are  not  key-dependent.  The  notation  A^  is  used  for  such  partial  decryptions. 

Use of the key and tweak is also byte oriented. A certain byte is denoted by $k_i$ for the
key and $t_i$ for the tweak, starting with the number one for the most significant byte.
Multiple-byte round keys are denoted by $K$. Different versions of tweak bytes in parallel
encryptions are denoted by a comma in the subscript. For example, $t_{1,2}$ denotes the most
significant tweak byte in the second parallel encryption.

3.3 24-bit Version (SoDark-3)

Each round operates on the incoming 24-bit word by splitting it into three bytes $A^{(r-1)}$,
$B^{(r-1)}$, and $C^{(r-1)}$, with $A^{(r-1)}$ containing the most significant bits and
$C^{(r-1)}$ the least significant. It then calculates three output bytes $A^{(r)}$, $B^{(r)}$,
and $C^{(r)}$ in the following manner:

$$A^{(r)} = s\left(A^{(r-1)} \oplus B^{(r-1)} \oplus k_1\right) \tag{3.1}$$

$$C^{(r)} = s\left(C^{(r-1)} \oplus B^{(r-1)} \oplus k_2\right) \tag{3.2}$$

$$B^{(r)} = s\left(B^{(r-1)} \oplus A^{(r)} \oplus C^{(r)} \oplus k_3\right) \tag{3.3}$$

where $s$ denotes the S-box lookup function and $k_1$, $k_2$, and $k_3$ are the most, middle,
and least significant parts of the round key. Figure 3.1 shows the encryption process.
Decryption is performed by inverting the operations:

$$B^{(r-1)} = s^{-1}\left(B^{(r)}\right) \oplus A^{(r)} \oplus C^{(r)} \oplus k_3 \tag{3.4}$$

$$A^{(r-1)} = s^{-1}\left(A^{(r)}\right) \oplus B^{(r-1)} \oplus k_1 \tag{3.5}$$

$$C^{(r-1)} = s^{-1}\left(C^{(r)}\right) \oplus B^{(r-1)} \oplus k_2. \tag{3.6}$$
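
Equations 3.1 through 3.6 translate almost directly into code. In the sketch below, the S-box is a seeded random permutation used purely as a placeholder; it is not the SoDark S-box from Table 3.1, and the function names are chosen for this example.

```python
import random

# Placeholder S-box: a seeded random permutation, NOT the real SoDark S-box.
_rng = random.Random(0)
SBOX = list(range(256))
_rng.shuffle(SBOX)
SBOX_INV = [0] * 256
for index, value in enumerate(SBOX):
    SBOX_INV[value] = index


def round_encrypt(state, round_key):
    """One SoDark-3 round on the bytes (A, B, C) with round key (k1, k2, k3)."""
    a, b, c = state
    k1, k2, k3 = round_key
    a = SBOX[a ^ b ^ k1]                # Equation 3.1
    c = SBOX[c ^ b ^ k2]                # Equation 3.2
    b = SBOX[b ^ a ^ c ^ k3]            # Equation 3.3 uses the new A and C
    return a, b, c


def round_decrypt(state, round_key):
    """Inverse of round_encrypt."""
    a, b, c = state
    k1, k2, k3 = round_key
    b = SBOX_INV[b] ^ a ^ c ^ k3        # Equation 3.4
    a = SBOX_INV[a] ^ b ^ k1            # Equation 3.5 uses the recovered B
    c = SBOX_INV[c] ^ b ^ k2            # Equation 3.6
    return a, b, c


state = (0x12, 0x34, 0x56)
key = (0xA1, 0xB2, 0xC3)
encrypted = round_encrypt(state, key)
```

Note the order of operations: $B^{(r)}$ depends on the already updated $A^{(r)}$ and $C^{(r)}$, so decryption must recover $B^{(r-1)}$ first.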

The  key  schedule  is  completely  linear.  For  each  round,  three  bytes  of  key  and  three  bytes  of 
tweak  are  XORed  to  create  a  24-bit  round  key.  The  bytes  are  used  in  order  and  the  different 
lengths  of  the  key  and  tweak  ensure  that  the  round  keys  are  different. 
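
The schedule described above can be sketched as follows, under the assumption that "used in order" means the seven key bytes and eight tweak bytes are each consumed cyclically. The function name and the `width` parameter (three bytes per round for the 24-bit cipher) are choices made for this illustration.

```python
def round_keys(key, tweak, rounds, width=3):
    """Return one tuple of `width` round-key bytes per round.

    Byte j of the overall sequence is key[j mod len(key)] XOR
    tweak[j mod len(tweak)] (an assumed reading of "used in order").
    """
    schedule = []
    for r in range(rounds):
        schedule.append(tuple(
            key[(r * width + i) % len(key)] ^ tweak[(r * width + i) % len(tweak)]
            for i in range(width)))
    return schedule


# Recognizable test values: k_j = j and t_j = 0x10 * j.
key = bytes(range(1, 8))                        # 56-bit key, k1..k7
tweak = bytes(0x10 * j for j in range(1, 9))    # 64-bit tweak, t1..t8
schedule = round_keys(key, tweak, rounds=16)
```

Because 7 and 8 are coprime, each key byte is paired with a different tweak byte every time it recurs within the first 56 bytes of the sequence.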

The round keys for the first 16 rounds are listed in Table 3.2. As is apparent from the table,
and assuming the tweak is known, knowledge of any round key will reveal parts of at least
half of the round keys.

Figure 3.1. The first two rounds of the SoDark-3 algorithm.

Table 3.2. Lattice and SoDark-3 key schedule.

3.4 48-bit Version (SoDark-6)

The  version  of  the  algorithm  with  48-bit  block  length,  SoDark-6,  is  a  direct  extension  of 
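SoDark-3. Figure 3.2 shows the encryption process. Each round splits the incoming 48-bit
word into six bytes $A^{(r-1)}$, $B^{(r-1)}$, $C^{(r-1)}$, $D^{(r-1)}$, $E^{(r-1)}$, and
$F^{(r-1)}$, with $A^{(r-1)}$ containing the most significant bits and $F^{(r-1)}$ the least
significant. It then calculates six output bytes in the following manner:

$$A^{(r)} = s\left(A^{(r-1)} \oplus B^{(r-1)} \oplus F^{(r-1)} \oplus k_1\right) \tag{3.7}$$

$$C^{(r)} = s\left(B^{(r-1)} \oplus C^{(r-1)} \oplus D^{(r-1)} \oplus k_2\right) \tag{3.8}$$

$$E^{(r)} = s\left(D^{(r-1)} \oplus E^{(r-1)} \oplus F^{(r-1)} \oplus k_3\right) \tag{3.9}$$

$$B^{(r)} = s\left(A^{(r)} \oplus B^{(r-1)} \oplus C^{(r)} \oplus k_4\right) \tag{3.10}$$

$$D^{(r)} = s\left(C^{(r)} \oplus D^{(r-1)} \oplus E^{(r)} \oplus k_5\right) \tag{3.11}$$

$$F^{(r)} = s\left(E^{(r)} \oplus F^{(r-1)} \oplus A^{(r)} \oplus k_6\right). \tag{3.12}$$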


Again, $k_i$ denotes the $i$th byte of the round key, where $k_1$ is the most significant.
The key schedule is analogous to the one used by the 24-bit versions and is shown in
Table 3.3. Decryption is also analogous to the 24-bit version:


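$$B^{(r-1)} = s^{-1}\left(B^{(r)}\right) \oplus A^{(r)} \oplus C^{(r)} \oplus k_4 \tag{3.13}$$

$$D^{(r-1)} = s^{-1}\left(D^{(r)}\right) \oplus C^{(r)} \oplus E^{(r)} \oplus k_5 \tag{3.14}$$

$$F^{(r-1)} = s^{-1}\left(F^{(r)}\right) \oplus E^{(r)} \oplus A^{(r)} \oplus k_6 \tag{3.15}$$

$$A^{(r-1)} = s^{-1}\left(A^{(r)}\right) \oplus B^{(r-1)} \oplus F^{(r-1)} \oplus k_1 \tag{3.16}$$

$$C^{(r-1)} = s^{-1}\left(C^{(r)}\right) \oplus B^{(r-1)} \oplus D^{(r-1)} \oplus k_2 \tag{3.17}$$

$$E^{(r-1)} = s^{-1}\left(E^{(r)}\right) \oplus D^{(r-1)} \oplus F^{(r-1)} \oplus k_3. \tag{3.18}$$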
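
As with the 24-bit version, the SoDark-6 round and its inverse can be sketched directly from Equations 3.7 through 3.18. The S-box is again a seeded random permutation standing in for the real table, and the function names are assumptions made for this example.

```python
import random

# Placeholder S-box: a seeded random permutation, NOT the real SoDark S-box.
_rng = random.Random(0)
SBOX = list(range(256))
_rng.shuffle(SBOX)
SBOX_INV = [0] * 256
for index, value in enumerate(SBOX):
    SBOX_INV[value] = index


def round_encrypt6(state, round_key):
    """One SoDark-6 round on the bytes (A, B, C, D, E, F)."""
    a, b, c, d, e, f = state
    k1, k2, k3, k4, k5, k6 = round_key
    a, c, e = (SBOX[a ^ b ^ f ^ k1],    # Equation 3.7
               SBOX[b ^ c ^ d ^ k2],    # Equation 3.8
               SBOX[d ^ e ^ f ^ k3])    # Equation 3.9
    b, d, f = (SBOX[a ^ b ^ c ^ k4],    # Equation 3.10
               SBOX[c ^ d ^ e ^ k5],    # Equation 3.11
               SBOX[e ^ f ^ a ^ k6])    # Equation 3.12
    return a, b, c, d, e, f


def round_decrypt6(state, round_key):
    """Inverse of round_encrypt6."""
    a, b, c, d, e, f = state
    k1, k2, k3, k4, k5, k6 = round_key
    b, d, f = (SBOX_INV[b] ^ a ^ c ^ k4,    # Equation 3.13
               SBOX_INV[d] ^ c ^ e ^ k5,    # Equation 3.14
               SBOX_INV[f] ^ e ^ a ^ k6)    # Equation 3.15
    a, c, e = (SBOX_INV[a] ^ b ^ f ^ k1,    # Equation 3.16
               SBOX_INV[c] ^ b ^ d ^ k2,    # Equation 3.17
               SBOX_INV[e] ^ d ^ f ^ k3)    # Equation 3.18
    return a, b, c, d, e, f


state = (1, 2, 3, 4, 5, 6)
key = (10, 20, 30, 40, 50, 60)
encrypted = round_encrypt6(state, key)
```

The tuple assignments make explicit that the odd-position bytes A, C, and E are all computed from the previous round's state, while B, D, and F are computed from the new A, C, and E.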

One notable change from SoDark-3 is that the mixing of inputs “wraps around” in the
sense that the most and least significant bytes A and F are mixed with each other.

Table 3.3. SoDark-6 key schedule.

                          Round key K_r
Round   k1        k2        k3        k4        k5        k6
1       k1 ⊕ t1   k2 ⊕ t2   k3 ⊕ t3   k4 ⊕ t4   k5 ⊕ t5   k6 ⊕ t6
2       k7 ⊕ t7   k1 ⊕ t8   k2 ⊕ t1   k3 ⊕ t2   k4 ⊕ t3   k5 ⊕ t4
3       k6 ⊕ t5   k7 ⊕ t6   k1 ⊕ t7   k2 ⊕ t8   k3 ⊕ t1   k4 ⊕ t2
4       k5 ⊕ t3   k6 ⊕ t4   k7 ⊕ t5   k1 ⊕ t6   k2 ⊕ t7   k3 ⊕ t8
5       k4 ⊕ t1   k5 ⊕ t2   k6 ⊕ t3   k7 ⊕ t4   k1 ⊕ t5   k2 ⊕ t6
6       k3 ⊕ t7   k4 ⊕ t8   k5 ⊕ t1   k6 ⊕ t2   k7 ⊕ t3   k1 ⊕ t4
7       k2 ⊕ t5   k3 ⊕ t6   k4 ⊕ t7   k5 ⊕ t8   k6 ⊕ t1   k7 ⊕ t2
8       k1 ⊕ t3   k2 ⊕ t4   k3 ⊕ t5   k4 ⊕ t6   k5 ⊕ t7   k6 ⊕ t8
9       k7 ⊕ t1   k1 ⊕ t2   k2 ⊕ t3   k3 ⊕ t4   k4 ⊕ t5   k5 ⊕ t6
10      k6 ⊕ t7   k7 ⊕ t8   k1 ⊕ t1   k2 ⊕ t2   k3 ⊕ t3   k4 ⊕ t4
11      k5 ⊕ t5   k6 ⊕ t6   k7 ⊕ t7   k1 ⊕ t8   k2 ⊕ t1   k3 ⊕ t2
12      k4 ⊕ t3   k5 ⊕ t4   k6 ⊕ t5   k7 ⊕ t6   k1 ⊕ t7   k2 ⊕ t8
13      k3 ⊕ t1   k4 ⊕ t2   k5 ⊕ t3   k6 ⊕ t4   k7 ⊕ t5   k1 ⊕ t6
14      k2 ⊕ t7   k3 ⊕ t8   k4 ⊕ t1   k5 ⊕ t2   k6 ⊕ t3   k7 ⊕ t4
15      k1 ⊕ t5   k2 ⊕ t6   k3 ⊕ t7   k4 ⊕ t8   k5 ⊕ t1   k6 ⊕ t2
16      k7 ⊕ t3   k1 ⊕ t4   k2 ⊕ t5   k3 ⊕ t6   k4 ⊕ t7   k5 ⊕ t8

3.5 S-box Properties and Probable Generation and Selection Criteria

The  facts  that  none  of  the  available  descriptions  of  the  algorithm  mention  anything  about 
the  S-box  selection  criteria  and  that  the  S-box  is  the  only  nonlinear  part  of  the  cipher 
make  its  properties  important  to  study.  This  has  been  done  using  the  techniques  described 
in [26]. Following the strategy proposed in that article, the S-box was first studied using
what the authors call the “Pollock” technique. The name alludes to the 20th century
abstract expressionist painter; the technique simply consists of plotting the S-box's LAT
and DDT and studying the plots to find non-random patterns. The visualizations of the LAT
and DDT are
shown  in  Figures  3.3  and  3.4,  respectively.  Inspection  of  them  does  not  reveal  any  obvious 
non-random  patterns. 

In  [27],  the  study  of  the  visual  representation  of  the  LAT  modulo  4  is  suggested.  It  notes 
that  the  presence  of  patterns  there  can  indicate  that  the  S-box  was  generated  by  a  Feistel 
network  with  a  low  number  of  rounds.  The  LAT  modulo  4  of  the  SoDark  S-box  is  shown  in 
Figure  3.5.  It  does  indeed  show  unmistakable  patterns.  For  that  reason,  the  possibility  that 
the  S-box  was  generated  by  a  Feistel  network  was  investigated  using  the  techniques  described 
in  [26].  Algorithm  2  from  that  article,  DecomposeFeistel,  was  implemented  to  generate 
a  CNF  representation  of  a  Feistel  network  that  can  generate  the  S-box.  This  representation 
was  then  used  as  input  to  the  SAT  solvers  CryptoMiniSat  [28]  and  Treengeling  [29], 
which  found  the  problem  unsatisfiable.  This  ruled  out  the  possibility  that  the  S-box  was 
generated  by  a  Feistel  network  with  bijective  round  functions  and  five  or  fewer  rounds.  The 
authors of [27] have noted, in an associated presentation, that randomly generated S-boxes
can have patterns in the LAT modulo 4 that look similar to those in S-boxes generated by
Feistel networks with more than five rounds.


With the hypothesis that the S-box was generated by a low-round Feistel network falsified,
the possibility that the S-box was a randomly selected permutation was investigated. In [26],
the probability distribution of the coefficients in the LAT of a random permutation is given
as

$$P\left[c_{i,j} = 2z\right] = \frac{\binom{2^{n-1}}{2^{n-2}+z}^{2}}{\binom{2^{n}}{2^{n-1}}} \tag{3.19}$$

where $P[c_{i,j} = 2z]$ is the probability that a particular combination of input and output
bits will have the bias $2z$ and $n$ is the S-box width in bits. This probability distribution
is plotted together with the distribution of the SoDark S-box LAT in Figures 3.6 and 3.7.
The predicted and actual distributions track each other very closely and a $\chi^2$ test was
made to establish the goodness of fit. With $\chi^2 = 91.3$ and 38 degrees of freedom, this
yields a p-value less than 0.00001, which indicates a very high likelihood that the SoDark
S-box was chosen randomly. The only selection criterion was probably that there could be no
fixed points, i.e., no number $X \in \{0, 1\}^8$ such that $f(X) = X$.

It should be noted that the $\chi^2$ test assumes that each trial in the experiment is
independent of the other trials. This is not strictly true in the case of the different
factors in a LAT. The $\chi^2$ measure is still used here though, since it is believed to be
a good approximation of the goodness of fit, despite non-independence of the LAT biases.
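
The expected distribution in Equation 3.19 is easy to evaluate exactly, as in the sketch below. It assumes the usual hypergeometric normalization by $\binom{2^n}{2^{n-1}}$, and the function name is chosen for this illustration.

```python
from fractions import Fraction
from math import comb


def lat_coefficient_probability(z, n=8):
    """P[c = 2z] for a LAT coefficient of a random n-bit permutation."""
    half, quarter = 2 ** (n - 1), 2 ** (n - 2)
    if not -quarter <= z <= quarter:
        return Fraction(0)
    return Fraction(comb(half, quarter + z) ** 2, comb(2 ** n, half))


# Exact distribution over all attainable coefficient values 2z for n = 8.
distribution = {2 * z: lat_coefficient_probability(z) for z in range(-64, 65)}

# Expected number of the 255 * 255 nontrivial LAT entries with |c| >= 32.
expected_large = 255 * 255 * sum(p for c, p in distribution.items()
                                 if abs(c) >= 32)
```

Multiplying each probability by the number of nontrivial mask pairs gives the expected histogram against which an observed LAT can be compared.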

The  fact  that  the  S-box  was  chosen  at  random  means  that  it  is  unlikely  to  have  the  properties 
that  are  considered  important  for  S-boxes  used  in  modern  ciphers.  In  particular,  randomly 
chosen  S-boxes  are  typically  vulnerable  to  both  linear  and  differential  cryptanalysis  [16]. 
That  this  is  the  case  here  can  be  understood  by  studying  Figures  3.3,  3.4,  3.6,  and  3.7. 
The  highest  linear  bias  is  Jjg,  slightly  higher  than  the  average  expected  bias  of  a  random 
permutation  (see  Figure  3.7).  In  regard  to  resistance  to  differential  cryptanalysis,  the  delta 
uniformity  (highest  value  in  the  DDT)  is  also  high  at  14.  This  can  be  put  in  contrast  to  the 
delta  uniformity  of  S-boxes  that  have  been  engineered  to  provide  resistance  to  differential 
cryptanalysis,  such  as  the  AES  S-box,  where  the  delta  uniformity  is  4.  The  large  number  of 
high  probability  differentials  in  the  DDT  also  means  that  it  has  a  large  number  of  differentials 
with probability zero.

Figure 3.3. Graphic representation of the SoDark S-box LAT.

Figure 3.4. Graphic representation of the SoDark S-box DDT.

Figure 3.5. Graphic representation of the SoDark S-box LAT modulo 4.

Figure 3.6. Linear approximation distribution. [Plot of the number of in/out combinations
with bias 2z against 2z, comparing the expected random S-box LAT distribution with the
SoDark S-box LAT distribution.]

Figure 3.7. Linear approximation distribution, logarithmic scale. [Same quantities as in
Figure 3.6.]

3.6 Equivalence to the Even-Mansour Construction

Due to the commutative property of the XOR operation, each round of the algorithm can
be rewritten as a function of one 24-bit input vector $P_r = A \| B \| C$:

$$P_{r+1} = g(P_r \oplus K_r) \tag{3.20}$$

where $g$ is a bijective mapping $g : \{0, 1\}^{24} \to \{0, 1\}^{24}$ defined as

$$g(X) = g(A \| B \| C) = s(A) \oplus B' \,\|\, B' \,\|\, s(C) \oplus B' \tag{3.21}$$

and

$$B' = s\left(s(A) \oplus B \oplus s(C)\right). \tag{3.22}$$

The transformation

$$T(X) = T(A \| B \| C) = A \oplus B \,\|\, B \,\|\, B \oplus C \tag{3.23}$$

must also be applied before the first and after the last round to ensure the rewritten algorithm
is equivalent to the original definition. It follows from the definition of $g$ that it is
bijective, provided that $s$ is bijective. The SoDark algorithm with $r$ rounds can now be
expressed as

$$E_K(P) = T(g(g(\cdots g(g(T(P) \oplus K_1) \oplus K_2) \cdots \oplus K_{r-1}) \oplus K_r)) \tag{3.24}$$

where $K_i = k_1 \| k_2 \| k_3$ with values from Table 3.2. Figure 3.8 shows the algorithm
expressed in this manner. Decryption is identical to encryption with $g^{-1}$ in place of $g$
and the round keys applied in reverse order. A representation of SoDark-6 can be derived in
the same manner.
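
The claimed equivalence can be checked numerically with a short sketch. The S-box is a seeded random permutation used as a placeholder, since the argument only requires $s$ to be a bijection. One detail is an assumption of this example: each round-key byte is lined up with the state byte it is XORed into, so the byte called $k_3$ in Equation 3.3 lands in the middle (B) position.

```python
import random

rng = random.Random(1)
SBOX = list(range(256))
rng.shuffle(SBOX)                       # placeholder bijection, not the real S-box


def original(p, round_keys):
    """Rounds as defined by Equations 3.1 through 3.3."""
    a, b, c = p
    for k1, k2, k3 in round_keys:
        a = SBOX[a ^ b ^ k1]
        c = SBOX[c ^ b ^ k2]
        b = SBOX[b ^ a ^ c ^ k3]
    return a, b, c


def T(x):
    """Equation 3.23; T is its own inverse."""
    a, b, c = x
    return a ^ b, b, b ^ c


def g(x):
    """Equations 3.21 and 3.22."""
    a, b, c = x
    b_new = SBOX[SBOX[a] ^ b ^ SBOX[c]]
    return SBOX[a] ^ b_new, b_new, SBOX[c] ^ b_new


def even_mansour(p, round_keys):
    """Equation 3.24, with key bytes aligned to the state bytes (A, B, C)."""
    x = T(p)
    for k1, k2, k3 in round_keys:
        x = g((x[0] ^ k1, x[1] ^ k3, x[2] ^ k2))
    return T(x)


keys = [tuple(rng.randrange(256) for _ in range(3)) for _ in range(8)]
plaintext = (0x01, 0x23, 0x45)
same = original(plaintext, keys) == even_mansour(plaintext, keys)
```

Because $T$ is an involution, applying it once more after the last $g$ undoes the change of representation and yields the ordinary ciphertext bytes.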


Figure 3.8. An r-round iterated Even-Mansour construction with round
function G and initial and final transformations T. [Diagram: the plaintext P passes through
T, is then alternately XORed with the round keys K_1 through K_r and passed through G, and
finally passes through T to give the ciphertext C.]

From Equation 3.24, it is now clear that the algorithm is equivalent to the iterated EM
construction [30], with $g$ as the random permutation function and the transformation $T$
applied to the plaintext and ciphertext. The applications of $T$ and the last application of
$g$ provide no additional security as their inverses are known.

3.7  Properties  with  Respect  to  Linear  Cryptanalysis 

Since  the  8-bit  S-box  had  a  number  of  linear  combinations  of  input  and  output  bits  with  high 
bias,  the  assumption  was  made  that  the  prevalence  of  high-bias  linearities  would  remain  in 
the  transformation  into  a  24-bit  S-box.  It  is  not  feasible  to  generate  the  full  LAT  for  a  24-bit 
S-box,  since  this  process  has  very  high  time  and  memory  complexities.  For  that  reason, 
only  a  part  of  the  set  of  possible  linearizations  has  been  searched. 

Initially,  all  combinations  of  one,  two,  three,  and  four  input  and  output  bits  were  searched 
to  find  good  linearizations.  This  yielded  a  number  of  linearizations  with  significant  bias — 
some  over  10%.  The  best  linearizations  found  using  this  method  are  presented  in  Table  3.4. 

In  order  to  find  more  high-bias  linearizations  of  the  24-bit  S-box,  a  heuristic  search  algorithm 
was  used.  Different  combinations  of  high-linearity  input  masks  for  the  8-bit  S-box  and  their 
corresponding  output  masks  were  tried  on  the  24-bit  S-box.  The  results  of  this  were 
surprisingly  good:  Linearizations  with  up  to  14.8%  bias  were  found.  The  best  known 
linearizations  for  the  24-bit  S-box  are  presented  in  Table  3.5. 

In  all,  111  linearizations  with  a  bias  of  more  than  10%  have  been  found. 

Using a branch and bound algorithm, combinations of the S-box linearizations that
approximate five rounds of the cipher were found, i.e., the number of rounds needed for an
attack on the eight-round variant used in 2G ALE. The biases of those linearizations are so
low that, even if given all $2^{24}$ theoretically possible plaintext messages and their
corresponding ciphertexts, the probability of recovering key bits faster than by brute force
is prohibitively low.

Table 3.4. Linearizations of the 24-bit SoDark S-box found by searching
all one-, two-, three-, and four-bit combinations.

Input mask   Output mask   Bias
000060       002222        -10.94%
600000       222200        -10.94%
0000C8       00A0A0        -10.94%
C80000       A0A000        -10.94%
00009A       000202        -10.94%
9A0000       202000        -10.94%

Table 3.5. Best known linearizations for the 24-bit SoDark algorithm S-box.

Input mask   Output mask   Bias
000073       007777         14.8%
730000       777700         14.8%
000024       009191         14.1%
240000       919100         14.1%
0000C0       001515        -14.1%
C00000       151500        -14.1%

CHAPTER 4:
Structural Attacks


4.1  Measures  of  Complexity 

The efficiency of a cryptographic attack is measured by its complexity. It provides a
means of relating the speed of the attack to that of a brute force approach, or of comparing
different attacks with each other. An attack's complexity can be stated for its time, data,
and memory requirements. Time complexity specifies how many operations of some kind
the attack requires on average. It is normally the most important complexity considered.
Data  complexity  specifies  the  amount  of  data  needed  in  the  form  of  plaintext-ciphertext 
pairs,  or  the  like,  to  perform  the  attack.  Lastly,  an  attack’s  memory  complexity  describes 
the  amount  of  memory  that  it  needs  to  run. 

For the attacks presented in this and following chapters, complexities are stated in
exponential notation. In the case of an attack on $r$ rounds, the unit used to describe the
time complexity is the number of $r$-round encryptions that would take the same time to
perform. As an example, a brute force attack on SoDark, which uses 56-bit keys, is expected
to have a complexity of $2^{55}$ on average.

The  speed  of  SoDark  implementations  is  almost  entirely  dependent  on  the  number  of  S-box 
operations  performed.  The  number  and  speed  of  all  other  operations  required  in  encryption, 
decryption,  and  attacks  are  negligible  in  comparison.  For  that  reason,  the  number  of  S-box 
operations  required  to  test  one  key  in  a  brute  force  attack  will  be  used  to  calculate  the 
relative time complexity of other attacks. Testing one key in an $r$-round brute force attack
requires $3 \cdot (r - 1)$ S-box operations.

For  data  complexities,  the  unit  used  is  the  number  of  known  ciphertexts,  plaintext-ciphertext 
pairs,  or  plaintext-ciphertext-tweak  tuples  that  are  needed  to  perform  the  attack  with  some 
specified  probability  of  success. 

For memory complexities, the unit is the cipher block size.

For time complexities in particular, the actual complexity required to perform an attack may
vary. This is partly because of the “luck” aspect (the correct key could be among the first
or last tried) and partly because the number of matching differentials in some intermediate
state of the cipher may be unusually high or low. In the calculations of time complexities,
the  average  number  of  one-byte  keys  implying  a  specific  one-byte  target  differential  was 
used.  This  number  is  believed  to  result  in  the  best  estimates  of  average-case  performance. 
It  was  calculated  from  the  SoDark  S-box  DDT  as  =  ||||  ~  2.6.  Over  the  set  of  all 
possible  one-byte  keys,  the  average  number  of  possible  output  differentials  for  a  given  input 
differential  to  the  S-box  is  2^44  ~  100.  Both  these  averages  exclude  the  zero  differential, 
which  only  implies  itself. 
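
A minimal sketch of how such averages can be read off a difference distribution table (DDT) is shown below. It is written under two loud assumptions: the S-box is a stand-in random permutation (the real SoDark table is not reproduced in this section), and "keys implying a target differential" is interpreted as the mean of the non-zero DDT entries. All function names are hypothetical.

```python
import random


def make_standin_sbox(seed=1):
    """Stand-in 8-bit S-box: a seeded random permutation, NOT the SoDark table."""
    rng = random.Random(seed)
    table = list(range(256))
    rng.shuffle(table)
    return table


def difference_distribution_table(sbox):
    """ddt[d_in][d_out] counts the inputs x with s(x) ^ s(x ^ d_in) == d_out."""
    ddt = [[0] * 256 for _ in range(256)]
    for d_in in range(256):
        for x in range(256):
            ddt[d_in][sbox[x] ^ sbox[x ^ d_in]] += 1
    return ddt


def ddt_averages(ddt):
    """Return (mean non-zero entry, mean number of reachable outputs).

    The zero input differential (row 0) is excluded, as in the text.
    """
    rows = ddt[1:]
    reachable = [sum(1 for count in row if count) for row in rows]
    avg_outputs = sum(reachable) / len(reachable)
    avg_keys = sum(sum(row) for row in rows) / sum(reachable)
    return avg_keys, avg_outputs


avg_keys, avg_outputs = ddt_averages(
    difference_distribution_table(make_standin_sbox()))
print(avg_keys, avg_outputs)
```

For a fixed pair of S-box inputs, the number of one-byte keys $k$ that produce a given output differential equals the corresponding DDT entry, which is why the table is the natural source for both averages. The two values always multiply to 256, so roughly 100 reachable output differentials correspond to roughly 2.6 keys per reachable differential.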

4.2  Attacks  on  Iterated  Even-Mansour  Constructions 

It is immediately apparent that one round of SoDark provides no security at all since,
given one plaintext-ciphertext-tweak tuple $(P, C, T)$, the key can be recovered by
$K = g^{-1}(C) \oplus P \oplus T$. Two rounds of the algorithm are equivalent to the original
one-round EM construction described in [30].
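
The one-round key recovery can be illustrated with a toy sketch. The permutation `g` below is a hypothetical stand-in that applies a random 8-bit S-box to each byte; the real 24-bit permutation of SoDark-3 also mixes the bytes, but the recovery formula only relies on `g` being a known, invertible permutation.

```python
import random

# Stand-in S-box (an assumption for illustration, not the SoDark table).
_rng = random.Random(7)
SBOX = list(range(256))
_rng.shuffle(SBOX)
SBOX_INV = [0] * 256
for _x, _y in enumerate(SBOX):
    SBOX_INV[_y] = _x


def xor(*blocks):
    """Bytewise XOR of equally long byte strings."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, value in enumerate(block):
            out[i] ^= value
    return bytes(out)


def g(block):
    """Toy public permutation on a 24-bit block."""
    return bytes(SBOX[b] for b in block)


def g_inv(block):
    """Inverse of the toy permutation."""
    return bytes(SBOX_INV[b] for b in block)


def encrypt_one_round(plaintext, key, tweak):
    """C = g(P xor K xor T)."""
    return g(xor(plaintext, key, tweak))


def recover_key(plaintext, ciphertext, tweak):
    """K = g^-1(C) xor P xor T."""
    return xor(g_inv(ciphertext), plaintext, tweak)


key = bytes([0x12, 0x34, 0x56])
pt, tw = bytes([1, 2, 3]), bytes([9, 8, 7])
print(recover_key(pt, encrypt_one_round(pt, key, tw), tw) == key)
```

A single known tuple is enough because nothing secret remains once the public permutation has been inverted.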

Known and chosen plaintext attacks on the EM construction corresponding to the lower
bound proven by Even and Mansour in [30] are presented by Daemen in [31]. For the
case of two independent subkeys of size $n$, Daemen shows a known-plaintext attack with an
average time complexity proportional to $2^{n-1}$ and a chosen-plaintext attack with complexity
proportional to $2^{n/2}$. Both of these are significantly faster than the $2^{2n}$ complexity of a
brute force attack. Thus, the two-round SoDark-3 algorithm, with its 24-bit block, provides at
most 12 bits of security in regard to this attack. Further insight is given by [32], which shows
that independent subkeys in the single-round EM construction provide no added security
compared to a construction with identical subkeys.

Attacks on various iterated versions of the EM construction are presented in [33], [34],
and [35]. Notably, [34] demonstrates that, for < 4 rounds with two independent keys used
in any order throughout the rounds, the time complexity for recovering the keys is at most
proportional to $2^n$. (An r-round iterated EM construction uses r + 1 keys.)

Generalizing, [36] shows that an r-round iterated EM construction with independent
round  keys  has  an  upper  security  bound  of  r  ■  2^  queries  to  an  oracle,  where  n  is  the  block 
size.  The  article  also  shows  an  attack  for  an  r-round  iterated  EM  construction  with  time 
complexity  proportional  to  2™. 

While attacks on the SoDark cipher that consider it as an EM construction are directly
applicable, they are suboptimal because they regard the 24-bit S-box as a random permutation.
In reality, it is a combination of three 8-bit S-boxes (see Equations 3.21 and 3.22). This
structure can be used to mount the more efficient attacks described in the following sections.

4.3  Known-Plaintext  Attack  on  Two-Round  SoDark-3 


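The calculations for two rounds of encryption using SoDark-3 are:

$$A^{(1)} = s\left(A^{(0)} \oplus B^{(0)} \oplus k_1 \oplus t_1\right) \tag{4.1}$$

$$C^{(1)} = s\left(C^{(0)} \oplus B^{(0)} \oplus k_2 \oplus t_2\right) \tag{4.2}$$

$$B^{(1)} = s\left(B^{(0)} \oplus A^{(1)} \oplus C^{(1)} \oplus k_3 \oplus t_3\right) \tag{4.3}$$

$$A^{(2)} = s\left(A^{(1)} \oplus B^{(1)} \oplus k_4 \oplus t_4\right) \tag{4.4}$$

$$C^{(2)} = s\left(C^{(1)} \oplus B^{(1)} \oplus k_5 \oplus t_5\right) \tag{4.5}$$

$$B^{(2)} = s\left(B^{(1)} \oplus A^{(2)} \oplus C^{(2)} \oplus k_6 \oplus t_6\right). \tag{4.6}$$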

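Since the inverse S-box $s^{-1}$ and the tweak are known, the values

$$\hat{B}^{(2)} = s^{-1}\left(B^{(2)}\right) \oplus A^{(2)} \oplus C^{(2)} \oplus t_6 = B^{(1)} \oplus k_6 \tag{4.7}$$

$$\hat{C}^{(2)} = s^{-1}\left(C^{(2)}\right) \oplus t_5 = C^{(1)} \oplus B^{(1)} \oplus k_5 \tag{4.8}$$

$$\hat{A}^{(2)} = s^{-1}\left(A^{(2)}\right) \oplus t_4 = A^{(1)} \oplus B^{(1)} \oplus k_4 \tag{4.9}$$

can be calculated. From Equations 4.1 through 4.6, it is also evident that

$$\hat{A}^{(2)} = s\left(A^{(0)} \oplus B^{(0)} \oplus k_1 \oplus t_1\right) \oplus s\left(B^{(0)} \oplus A^{(1)} \oplus C^{(1)} \oplus k_3 \oplus t_3\right) \oplus k_4 \tag{4.10}$$

and

$$\hat{C}^{(2)} = s\left(C^{(0)} \oplus B^{(0)} \oplus k_2 \oplus t_2\right) \oplus s\left(B^{(0)} \oplus A^{(1)} \oplus C^{(1)} \oplus k_3 \oplus t_3\right) \oplus k_5. \tag{4.11}$$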
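
The two-round equations and the key-independent stripping of the last S-box layer can be checked with a short Python sketch. The S-box here is a stand-in random permutation (an assumption; the SoDark table is not given in this section), and the function names are hypothetical. States are ordered as (A, B, C).

```python
import random

# Stand-in 8-bit S-box and its inverse (NOT the real SoDark table).
_rng = random.Random(2017)
S = list(range(256))
_rng.shuffle(S)
S_INV = [0] * 256
for _x, _y in enumerate(S):
    S_INV[_y] = _x


def encrypt_two_rounds(pt, key, tweak):
    """Return the states after round one and round two, each as (A, B, C)."""
    a0, b0, c0 = pt
    k1, k2, k3, k4, k5, k6 = key
    t1, t2, t3, t4, t5, t6 = tweak
    a1 = S[a0 ^ b0 ^ k1 ^ t1]
    c1 = S[c0 ^ b0 ^ k2 ^ t2]
    b1 = S[b0 ^ a1 ^ c1 ^ k3 ^ t3]
    a2 = S[a1 ^ b1 ^ k4 ^ t4]
    c2 = S[c1 ^ b1 ^ k5 ^ t5]
    b2 = S[b1 ^ a2 ^ c2 ^ k6 ^ t6]
    return (a1, b1, c1), (a2, b2, c2)


def strip_last_sboxes(ct, tweak):
    """Undo the last S-box layer using only the ciphertext and the tweak."""
    a2, b2, c2 = ct
    t4, t5, t6 = tweak[3:6]
    a_hat = S_INV[a2] ^ t4
    c_hat = S_INV[c2] ^ t5
    b_hat = S_INV[b2] ^ a2 ^ c2 ^ t6
    return a_hat, b_hat, c_hat


rng = random.Random(0)
key = [rng.randrange(256) for _ in range(6)]
tweak = [rng.randrange(256) for _ in range(6)]
(a1, b1, c1), ct = encrypt_two_rounds((1, 2, 3), key, tweak)
a_hat, b_hat, c_hat = strip_last_sboxes(ct, tweak)
print(a_hat == a1 ^ b1 ^ key[3], b_hat == b1 ^ key[5], c_hat == c1 ^ b1 ^ key[4])
```

The three stripped values differ from the round-one state only by XOR with key bytes, which is what makes the key cancel when two tuples are XORed together in the derivation that follows.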

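Now, given two plaintext-ciphertext-tweak tuples, the differentials $\Delta A^{(1)}$, $\Delta C^{(1)}$, and
$\Delta B^{(0)}$ can be calculated (hats denote the partially decrypted values of Equations 4.7
through 4.9, and the second subscript of a tweak byte identifies the tuple):

$$\begin{aligned}
\Delta A^{(1)} &= \hat{A}_1^{(2)} \oplus \hat{A}_2^{(2)} \oplus \hat{B}_1^{(2)} \oplus \hat{B}_2^{(2)} \\
&= s\left(A_1^{(0)} \oplus B_1^{(0)} \oplus k_1 \oplus t_{1,1}\right) \oplus s\left(B_1^{(0)} \oplus A_1^{(1)} \oplus C_1^{(1)} \oplus k_3 \oplus t_{3,1}\right) \oplus {} \\
&\quad\; s\left(A_2^{(0)} \oplus B_2^{(0)} \oplus k_1 \oplus t_{1,2}\right) \oplus s\left(B_2^{(0)} \oplus A_2^{(1)} \oplus C_2^{(1)} \oplus k_3 \oplus t_{3,2}\right) \oplus {} \\
&\quad\; s\left(B_1^{(0)} \oplus A_1^{(1)} \oplus C_1^{(1)} \oplus k_3 \oplus t_{3,1}\right) \oplus s\left(B_2^{(0)} \oplus A_2^{(1)} \oplus C_2^{(1)} \oplus k_3 \oplus t_{3,2}\right) \\
&= s\left(A_1^{(0)} \oplus B_1^{(0)} \oplus k_1 \oplus t_{1,1}\right) \oplus s\left(A_2^{(0)} \oplus B_2^{(0)} \oplus k_1 \oplus t_{1,2}\right) \\
&= A_1^{(1)} \oplus A_2^{(1)}
\end{aligned} \tag{4.12}$$

$$\begin{aligned}
\Delta C^{(1)} &= \hat{C}_1^{(2)} \oplus \hat{C}_2^{(2)} \oplus \hat{B}_1^{(2)} \oplus \hat{B}_2^{(2)} \\
&= s\left(C_1^{(0)} \oplus B_1^{(0)} \oplus k_2 \oplus t_{2,1}\right) \oplus s\left(B_1^{(0)} \oplus A_1^{(1)} \oplus C_1^{(1)} \oplus k_3 \oplus t_{3,1}\right) \oplus {} \\
&\quad\; s\left(C_2^{(0)} \oplus B_2^{(0)} \oplus k_2 \oplus t_{2,2}\right) \oplus s\left(B_2^{(0)} \oplus A_2^{(1)} \oplus C_2^{(1)} \oplus k_3 \oplus t_{3,2}\right) \oplus {} \\
&\quad\; s\left(B_1^{(0)} \oplus A_1^{(1)} \oplus C_1^{(1)} \oplus k_3 \oplus t_{3,1}\right) \oplus s\left(B_2^{(0)} \oplus A_2^{(1)} \oplus C_2^{(1)} \oplus k_3 \oplus t_{3,2}\right) \\
&= s\left(C_1^{(0)} \oplus B_1^{(0)} \oplus k_2 \oplus t_{2,1}\right) \oplus s\left(C_2^{(0)} \oplus B_2^{(0)} \oplus k_2 \oplus t_{2,2}\right) \\
&= C_1^{(1)} \oplus C_2^{(1)}
\end{aligned} \tag{4.13}$$

$$\Delta B^{(0)} = s^{-1}\left(\hat{B}_1^{(2)} \oplus k_6\right) \oplus s^{-1}\left(\hat{B}_2^{(2)} \oplus k_6\right) \oplus \Delta A^{(1)} \oplus \Delta C^{(1)} \oplus t_{3,1} \oplus t_{3,2}. \tag{4.14}$$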

Equations 4.12, 4.13, and 4.14 show that the candidates for key bytes $k_1$, $k_2$, and $k_6$
can be searched independently of each other and the other key bytes. Each value of $k_1$,
$k_2$, and $k_6$ will imply a value for $\Delta A^{(1)}$, $\Delta C^{(1)}$, and $\Delta B^{(0)}$, respectively, and those that do
not generate the differential calculated from the ciphertext (or the plaintext in the case of
$k_6$) can be immediately discarded. This process is shown in Figure 4.1.

Figure 4.1. Attack on two-round SoDark-3 by guessing key bytes $k_1$, $k_2$, and $k_6$
independently and matching the results with $\Delta A^{(1)}$, $\Delta C^{(1)}$, and $\Delta B^{(0)}$. The parts of
the cipher marked in blue are known or can be calculated without guessing any part of the key.


On average, 2.6 candidate values each for $k_1$, $k_2$, and $k_6$ are expected as a result of this
search. Now, for each possible tuple $k_1, k_2, k_6$, the values of $k_3$, $k_4$, and $k_5$ are calculated. If
the values of those match for both plaintext-ciphertext-tweak tuples, we have a candidate
key that can be verified against further plaintext-ciphertext-tweak tuples. The full attack
process is described in Algorithm 4.1.

In calculating the time complexity of the attack, we first note that $2^8$ keys have to be tested
for each of $k_1$, $k_2$, and $k_6$. Each test uses two S-box operations. This will yield, on average,
$2.6^3 \approx 17.6$ candidate $k_1, k_2, k_6$ tuples. For each of those, $k_3$, $k_4$, and $k_5$ are calculated
using both plaintext-ciphertext-tweak tuples. This requires six S-box operations per key
tuple, but some of those can be cached in between iterations; see Algorithm 4.1. Therefore,
the total average time complexity of the two-round attack is

$$\frac{6 \cdot 2^8 + 2 \cdot 2.6 + 2 \cdot 2.6^2 + 2 \cdot 2.6^3}{3} \approx 2^9. \tag{4.15}$$

Any pair of plaintext-ciphertext-tweak tuples that satisfy

$$\Delta A^{(0)} \oplus \Delta B^{(0)} \oplus t_{1,1} \oplus t_{1,2} \neq 0 \tag{4.16}$$

$$\Delta C^{(0)} \oplus \Delta B^{(0)} \oplus t_{2,1} \oplus t_{2,2} \neq 0 \tag{4.17}$$

$$\Delta A^{(1)} \oplus \Delta B^{(0)} \oplus \Delta C^{(1)} \oplus t_{3,1} \oplus t_{3,2} \neq 0 \tag{4.18}$$

can be used in the attack. Since the number of tuple pairs that do not satisfy this requirement
is quite small, the attack works for virtually any pair, making the data complexity 2. The
memory complexity is $2^{4.1}$.
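
The arithmetic behind the two-round time complexity can be reproduced with a few lines of Python. This is only a check of the numbers quoted in the text; the function name is hypothetical, and the divisor 3 is the number of S-box operations per brute-force key test for two rounds.

```python
import math


def two_round_attack_cost(candidates_per_byte: float = 2.6) -> float:
    """Return the attack cost in units of two-round encryptions."""
    sieve_ops = 6 * 2**8  # three key bytes, 2**8 guesses each, two S-box ops per guess
    combine_ops = (2 * candidates_per_byte
                   + 2 * candidates_per_byte**2
                   + 2 * candidates_per_byte**3)
    return (sieve_ops + combine_ops) / 3


cost = two_round_attack_cost()
print(cost, math.log2(cost))
```

The sieving term dominates: the combination step adds only a few dozen S-box operations, so the attack costs roughly as much as 530 two-round encryptions.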
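
Algorithm 4.1 Perform a known-plaintext attack on two-round SoDark-3 and print all
candidate keys.

procedure CrackTwoRounds($P_1$, $C_1$, $T_1$, $P_2$, $C_2$, $T_2$)
    $\Delta B^{(0)} \leftarrow B_1^{(0)} \oplus B_2^{(0)}$
    $\hat{A}_1^{(2)} \leftarrow s^{-1}(A_1^{(2)}) \oplus t_{4,1}$
    $\hat{A}_2^{(2)} \leftarrow s^{-1}(A_2^{(2)}) \oplus t_{4,2}$
    $\hat{B}_1^{(2)} \leftarrow s^{-1}(B_1^{(2)}) \oplus A_1^{(2)} \oplus C_1^{(2)} \oplus t_{6,1}$
    $\hat{B}_2^{(2)} \leftarrow s^{-1}(B_2^{(2)}) \oplus A_2^{(2)} \oplus C_2^{(2)} \oplus t_{6,2}$
    $\Delta \hat{B}^{(2)} \leftarrow \hat{B}_1^{(2)} \oplus \hat{B}_2^{(2)}$
    $\Delta A^{(1)} \leftarrow \hat{A}_1^{(2)} \oplus \hat{A}_2^{(2)} \oplus \Delta \hat{B}^{(2)}$
    $\hat{C}_1^{(2)} \leftarrow s^{-1}(C_1^{(2)}) \oplus t_{5,1}$
    $\hat{C}_2^{(2)} \leftarrow s^{-1}(C_2^{(2)}) \oplus t_{5,2}$
    $\Delta C^{(1)} \leftarrow \hat{C}_1^{(2)} \oplus \hat{C}_2^{(2)} \oplus \Delta \hat{B}^{(2)}$
    $L_{k_1} \leftarrow$ empty list
    $L_{k_2} \leftarrow$ empty list
    $L_{k_6} \leftarrow$ empty list
    for all $k_1$ do    ▷ $2^8$ possible $k_1$
        if $s(A_1^{(0)} \oplus B_1^{(0)} \oplus k_1 \oplus t_{1,1}) \oplus s(A_2^{(0)} \oplus B_2^{(0)} \oplus k_1 \oplus t_{1,2}) = \Delta A^{(1)}$ then
            $L_{k_1}$.append($k_1$)
        end if
    end for
    for all $k_2$ do    ▷ $2^8$ possible $k_2$
        if $s(C_1^{(0)} \oplus B_1^{(0)} \oplus k_2 \oplus t_{2,1}) \oplus s(C_2^{(0)} \oplus B_2^{(0)} \oplus k_2 \oplus t_{2,2}) = \Delta C^{(1)}$ then
            $L_{k_2}$.append($k_2$)
        end if
    end for
    for all $k_6$ do    ▷ $2^8$ possible $k_6$
        if $s^{-1}(\hat{B}_1^{(2)} \oplus k_6) \oplus s^{-1}(\hat{B}_2^{(2)} \oplus k_6) \oplus \Delta A^{(1)} \oplus \Delta C^{(1)} \oplus t_{3,1} \oplus t_{3,2} = \Delta B^{(0)}$ then
            $L_{k_6}$.append($k_6$)
        end if
    end for
    for all $k_1 \in L_{k_1}$ do    ▷ Average 2.6 $k_1$
        $A_1^{(1)} \leftarrow s(A_1^{(0)} \oplus B_1^{(0)} \oplus k_1 \oplus t_{1,1})$
        $A_2^{(1)} \leftarrow s(A_2^{(0)} \oplus B_2^{(0)} \oplus k_1 \oplus t_{1,2})$
        for all $k_2 \in L_{k_2}$ do    ▷ Average 2.6 $k_2$
            $C_1^{(1)} \leftarrow s(C_1^{(0)} \oplus B_1^{(0)} \oplus k_2 \oplus t_{2,1})$
            $C_2^{(1)} \leftarrow s(C_2^{(0)} \oplus B_2^{(0)} \oplus k_2 \oplus t_{2,2})$
            for all $k_6 \in L_{k_6}$ do    ▷ Average 2.6 $k_6$
                $B_1^{(1)} \leftarrow \hat{B}_1^{(2)} \oplus k_6$
                $B_2^{(1)} \leftarrow \hat{B}_2^{(2)} \oplus k_6$
                $k_{3,1} \leftarrow B_1^{(0)} \oplus A_1^{(1)} \oplus C_1^{(1)} \oplus s^{-1}(B_1^{(1)}) \oplus t_{3,1}$
                $k_{3,2} \leftarrow B_2^{(0)} \oplus A_2^{(1)} \oplus C_2^{(1)} \oplus s^{-1}(B_2^{(1)}) \oplus t_{3,2}$
                $k_{4,1} \leftarrow B_1^{(1)} \oplus A_1^{(1)} \oplus \hat{A}_1^{(2)}$
                $k_{4,2} \leftarrow B_2^{(1)} \oplus A_2^{(1)} \oplus \hat{A}_2^{(2)}$
                $k_{5,1} \leftarrow B_1^{(1)} \oplus C_1^{(1)} \oplus \hat{C}_1^{(2)}$
                $k_{5,2} \leftarrow B_2^{(1)} \oplus C_2^{(1)} \oplus \hat{C}_2^{(2)}$
                if $k_{3,1} = k_{3,2}$ and $k_{4,1} = k_{4,2}$ and $k_{5,1} = k_{5,2}$ then
                    $k_3 \leftarrow k_{3,1}$
                    $k_4 \leftarrow k_{4,1}$
                    $k_5 \leftarrow k_{5,1}$
                    Print($k_1 \,\|\, k_2 \,\|\, k_3 \,\|\, k_4 \,\|\, k_5 \,\|\, k_6$)
                end if
            end for
        end for
    end for
end procedure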
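
A runnable Python sketch of the two-round attack is given below. It follows the steps of Algorithm 4.1 as far as they can be read from the text, but it is an illustration rather than the author's implementation: the S-box is a stand-in random permutation (the SoDark table is not reproduced here), states and ciphertexts are ordered as (A, B, C), and all function names are hypothetical.

```python
import random
from itertools import product

# Stand-in 8-bit S-box (NOT the real SoDark table) and its inverse.
_rng = random.Random(42)
S = list(range(256))
_rng.shuffle(S)
S_INV = [0] * 256
for _x, _y in enumerate(S):
    S_INV[_y] = _x


def encrypt_two_rounds(pt, key, tw):
    """Two rounds of the byte-level round function; key and tw hold six bytes."""
    a, b, c = pt
    a = S[a ^ b ^ key[0] ^ tw[0]]
    c = S[c ^ b ^ key[1] ^ tw[1]]
    b = S[b ^ a ^ c ^ key[2] ^ tw[2]]
    a = S[a ^ b ^ key[3] ^ tw[3]]
    c = S[c ^ b ^ key[4] ^ tw[4]]
    b = S[b ^ a ^ c ^ key[5] ^ tw[5]]
    return a, b, c


def crack_two_rounds(p1, c1, tw1, p2, c2, tw2):
    """Return every six-byte key that is consistent with both tuples."""
    tuples = [(p1, c1, tw1), (p2, c2, tw2)]
    # Strip the last S-box layer; this needs no key material.
    a_hat = [S_INV[c[0]] ^ tw[3] for _, c, tw in tuples]
    c_hat = [S_INV[c[2]] ^ tw[4] for _, c, tw in tuples]
    b_hat = [S_INV[c[1]] ^ c[0] ^ c[2] ^ tw[5] for _, c, tw in tuples]
    d_bhat = b_hat[0] ^ b_hat[1]
    d_a1 = a_hat[0] ^ a_hat[1] ^ d_bhat   # differential of A after round one
    d_c1 = c_hat[0] ^ c_hat[1] ^ d_bhat   # differential of C after round one
    d_b0 = p1[1] ^ p2[1]                  # plaintext differential of B

    def a_round1(i, k1):
        p, _, tw = tuples[i]
        return S[p[0] ^ p[1] ^ k1 ^ tw[0]]

    def c_round1(i, k2):
        p, _, tw = tuples[i]
        return S[p[2] ^ p[1] ^ k2 ^ tw[1]]

    # Independent sieving of k1, k2 and k6 against the known differentials.
    l1 = [k for k in range(256) if a_round1(0, k) ^ a_round1(1, k) == d_a1]
    l2 = [k for k in range(256) if c_round1(0, k) ^ c_round1(1, k) == d_c1]
    l6 = [k for k in range(256)
          if S_INV[b_hat[0] ^ k] ^ S_INV[b_hat[1] ^ k]
          ^ d_a1 ^ d_c1 ^ tw1[2] ^ tw2[2] == d_b0]

    found = []
    for k1, k2, k6 in product(l1, l2, l6):
        implied = []
        for i, (p, _, tw) in enumerate(tuples):
            a, c, b = a_round1(i, k1), c_round1(i, k2), b_hat[i] ^ k6
            k3 = p[1] ^ a ^ c ^ S_INV[b] ^ tw[2]
            k4 = b ^ a ^ a_hat[i]
            k5 = b ^ c ^ c_hat[i]
            implied.append((k3, k4, k5))
        if implied[0] == implied[1]:      # both tuples imply the same k3, k4, k5
            found.append((k1, k2) + implied[0] + (k6,))
    return found


rng = random.Random(1)
secret = tuple(rng.randrange(256) for _ in range(6))
data = []
for _ in range(2):
    pt = tuple(rng.randrange(256) for _ in range(3))
    tw = tuple(rng.randrange(256) for _ in range(6))
    data.append((pt, encrypt_two_rounds(pt, secret, tw), tw))
(p1, c1, tw1), (p2, c2, tw2) = data
print(secret in crack_two_rounds(p1, c1, tw1, p2, c2, tw2))
```

Every key returned by the sketch reproduces both ciphertexts, so the remaining false candidates can only be removed with additional tuples, as the text notes. The sketch recomputes round-one values instead of caching them, which keeps it short at the cost of a few extra S-box lookups.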

4.4  Known-Plaintext  Attack  on  Three-Round  SoDark-3

The attack on three-round SoDark-3 is a direct extension of the two-round attack described
in the previous section. The encryption process is analogous to the one shown in
Equations 4.1 through 4.6 and the last round can be partially reversed to calculate $\hat{A}^{(3)}$,
$\hat{B}^{(3)}$, and $\hat{C}^{(3)}$ using the method shown in Equations 4.7 through 4.9.

The  attack  is  shown  graphically  in  Figure  4.2.  It  uses  the  fact  that  two  of  the  three  bytes 
in  the  first  and  last  round  keys  are  identical  to  perform  partial  differential  matching  in  the 
middle  round. 
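
The reuse of key bytes across rounds can be made concrete with a small sketch. It assumes, as the equations and algorithms in this chapter suggest, that the seven key bytes and eight tweak bytes are consumed cyclically, three per round, in the order A, C, B. The authoritative schedule is Table 3.2, which is not reproduced in this section, so the function below is a hypothetical reconstruction.

```python
def round_schedule(rounds, key_len=7, tweak_len=8):
    """Return, per round, the 1-based (key, tweak) indices for the A, C and B bytes."""
    schedule = []
    for r in range(rounds):
        entry = []
        for j in range(3):          # j = 0: A byte, 1: C byte, 2: B byte
            i = 3 * r + j
            entry.append((i % key_len + 1, i % tweak_len + 1))
        schedule.append(tuple(entry))
    return schedule


for number, entry in enumerate(round_schedule(3), start=1):
    print(number, entry)
```

Under this assumption the third round uses $k_7$, $k_1$, $k_2$ with tweak bytes $t_7$, $t_8$, $t_1$, so $k_1$ and $k_2$ appear in both the first and the last round of the three-round cipher, which is the property the attack exploits.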

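First, by guessing key byte $k_2$, $\Delta C^{(1)}$ can be calculated from the plaintext as

$$C_1^{(1)} = s\left(C_1^{(0)} \oplus B_1^{(0)} \oplus k_2 \oplus t_{2,1}\right) \tag{4.19}$$

$$C_2^{(1)} = s\left(C_2^{(0)} \oplus B_2^{(0)} \oplus k_2 \oplus t_{2,2}\right) \tag{4.20}$$

$$\Delta C^{(1)} = C_1^{(1)} \oplus C_2^{(1)} \tag{4.21}$$

and, by calculating $\Delta A^{(2)}$ and $\Delta C^{(2)}$ in the same way as in Equations 4.12 and 4.13, $\Delta B^{(1)}$
can be calculated from the ciphertext as

$$\Delta B^{(1)} = \Delta A^{(2)} \oplus \Delta C^{(2)} \oplus s^{-1}\left(\hat{B}_1^{(3)} \oplus k_2\right) \oplus s^{-1}\left(\hat{B}_2^{(3)} \oplus k_2\right) \oplus t_{6,1} \oplus t_{6,2}. \tag{4.22}$$

Now, the value of $\Delta C^{(1)}$ can be compared with $\Delta B^{(1)} \oplus \Delta \hat{C}^{(2)}$, where $\Delta \hat{C}^{(2)}$ is calculated by
guessing $k_1$ in addition to $k_2$:

$$B_1^{(2)} = \hat{B}_1^{(3)} \oplus k_2 \tag{4.23}$$

$$B_2^{(2)} = \hat{B}_2^{(3)} \oplus k_2 \tag{4.24}$$

$$C_1^{(2)} = B_1^{(2)} \oplus \hat{C}_1^{(3)} \oplus k_1 \tag{4.25}$$

$$C_2^{(2)} = B_2^{(2)} \oplus \hat{C}_2^{(3)} \oplus k_1 \tag{4.26}$$

$$\Delta \hat{C}^{(2)} = s^{-1}\left(C_1^{(2)}\right) \oplus s^{-1}\left(C_2^{(2)}\right) \oplus t_{5,1} \oplus t_{5,2}. \tag{4.27}$$

If $\Delta C^{(1)}$ and $\Delta B^{(1)} \oplus \Delta \hat{C}^{(2)}$ are equal, the $k_1, k_2$ pair is a candidate for those key bytes. This
is expected to happen with probability $2^{-8}$, resulting in $2^8$ candidates for $k_1, k_2$.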

Figure 4.2. Attack on three-round SoDark-3 by first guessing key bytes
$k_1$ and $k_2$ independently and matching on $\Delta C^{(1)}$. In the case of that match,
$k_7$ is guessed and matching on $\Delta A^{(1)}$ is performed. The parts of the cipher
marked in blue are known or can be calculated without guessing any part of
the key.
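
For each candidate pair $k_1, k_2$, possible values for $k_7$ are then sought. This is done by
guessing $k_7$ and comparing $\Delta A^{(1)}$ to $\Delta \hat{A}^{(2)} \oplus \Delta B^{(1)}$:

$$A_1^{(1)} = s\left(A_1^{(0)} \oplus B_1^{(0)} \oplus k_1 \oplus t_{1,1}\right) \tag{4.28}$$

$$A_2^{(1)} = s\left(A_2^{(0)} \oplus B_2^{(0)} \oplus k_1 \oplus t_{1,2}\right) \tag{4.29}$$

$$\Delta A^{(1)} = A_1^{(1)} \oplus A_2^{(1)} \tag{4.30}$$

$$A_1^{(2)} = \hat{A}_1^{(3)} \oplus B_1^{(2)} \oplus k_7 \tag{4.31}$$

$$A_2^{(2)} = \hat{A}_2^{(3)} \oplus B_2^{(2)} \oplus k_7 \tag{4.32}$$

$$\Delta \hat{A}^{(2)} = s^{-1}\left(A_1^{(2)}\right) \oplus s^{-1}\left(A_2^{(2)}\right) \oplus t_{4,1} \oplus t_{4,2}. \tag{4.33}$$

As before, if $\Delta A^{(1)}$ and $\Delta B^{(1)} \oplus \Delta \hat{A}^{(2)}$ match, then the tuple $k_1, k_2, k_7$ is a candidate for those
key bytes. For the same reason, we expect $2^8$ candidates for $k_1, k_2, k_7$ to remain after this
step.

Finally, for each candidate tuple $k_1, k_2, k_7$, possible values of $k_3$ are found by checking that
the value of $k_4$ implied by a guessed $k_3$ is the same for both plaintext-ciphertext-tweak
tuples:

$$k_{4,1} = s\left(A_1^{(1)} \oplus C_1^{(1)} \oplus B_1^{(0)} \oplus k_3 \oplus t_{3,1}\right) \oplus s^{-1}\left(\hat{A}_1^{(3)} \oplus B_1^{(2)} \oplus k_7\right) \oplus A_1^{(1)} \oplus t_{4,1} \tag{4.34}$$

$$k_{4,2} = s\left(A_2^{(1)} \oplus C_2^{(1)} \oplus B_2^{(0)} \oplus k_3 \oplus t_{3,2}\right) \oplus s^{-1}\left(\hat{A}_2^{(3)} \oplus B_2^{(2)} \oplus k_7\right) \oplus A_2^{(1)} \oplus t_{4,2}. \tag{4.35}$$

Then, the values of $k_5$ and $k_6$ can be calculated from the values already known, thus yielding
a full candidate key:

$$k_5 = s^{-1}\left(C_1^{(2)}\right) \oplus B_1^{(1)} \oplus C_1^{(1)} \oplus t_{5,1} \tag{4.36}$$

$$k_6 = s^{-1}\left(B_1^{(2)}\right) \oplus A_1^{(2)} \oplus C_1^{(2)} \oplus B_1^{(1)} \oplus t_{6,1}. \tag{4.37}$$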

The complete attack is shown in Algorithm 4.2. Calculation of the time complexity is done
in the same way as for the two-round attack with the help of the algorithm description:

$$\frac{8 \cdot 2^8 + 6 \cdot 2^{16}}{6} \approx 2^{16}. \tag{4.38}$$

It is also clear from Algorithm 4.2 that no memory in addition to registers is needed to
perform the attack. As in the two-round case, the data complexity is 2.
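
Algorithm 4.2 Perform a known-plaintext attack on three-round SoDark-3 and print all
candidate keys.

procedure CrackThreeRounds($P_1$, $C_1$, $T_1$, $P_2$, $C_2$, $T_2$)
    $\hat{B}_1^{(3)} \leftarrow s^{-1}(B_1^{(3)}) \oplus A_1^{(3)} \oplus C_1^{(3)} \oplus t_{1,1}$
    $\hat{B}_2^{(3)} \leftarrow s^{-1}(B_2^{(3)}) \oplus A_2^{(3)} \oplus C_2^{(3)} \oplus t_{1,2}$
    $\Delta B^{(2)} \leftarrow \hat{B}_1^{(3)} \oplus \hat{B}_2^{(3)}$
    $\hat{A}_1^{(3)} \leftarrow s^{-1}(A_1^{(3)}) \oplus t_{7,1}$
    $\hat{A}_2^{(3)} \leftarrow s^{-1}(A_2^{(3)}) \oplus t_{7,2}$
    $\Delta A^{(2)} \leftarrow \hat{A}_1^{(3)} \oplus \hat{A}_2^{(3)} \oplus \Delta B^{(2)}$
    $\hat{C}_1^{(3)} \leftarrow s^{-1}(C_1^{(3)}) \oplus t_{8,1}$
    $\hat{C}_2^{(3)} \leftarrow s^{-1}(C_2^{(3)}) \oplus t_{8,2}$
    $\Delta C^{(2)} \leftarrow \hat{C}_1^{(3)} \oplus \hat{C}_2^{(3)} \oplus \Delta B^{(2)}$
    for all $k_2$ do    ▷ $2^8$ possible $k_2$
        $C_1^{(1)} \leftarrow s(C_1^{(0)} \oplus B_1^{(0)} \oplus k_2 \oplus t_{2,1})$
        $C_2^{(1)} \leftarrow s(C_2^{(0)} \oplus B_2^{(0)} \oplus k_2 \oplus t_{2,2})$
        $\Delta C^{(1)} \leftarrow C_1^{(1)} \oplus C_2^{(1)}$
        $B_1^{(2)} \leftarrow \hat{B}_1^{(3)} \oplus k_2$
        $B_2^{(2)} \leftarrow \hat{B}_2^{(3)} \oplus k_2$
        $\Delta B^{(1)} \leftarrow \Delta A^{(2)} \oplus \Delta C^{(2)} \oplus s^{-1}(B_1^{(2)}) \oplus s^{-1}(B_2^{(2)}) \oplus t_{6,1} \oplus t_{6,2}$
        for all $k_1$ do    ▷ $2^8$ possible $k_1$
            $C_1^{(2)} \leftarrow B_1^{(2)} \oplus \hat{C}_1^{(3)} \oplus k_1$
            $C_2^{(2)} \leftarrow B_2^{(2)} \oplus \hat{C}_2^{(3)} \oplus k_1$
            $\Delta \hat{C}^{(2)} \leftarrow s^{-1}(C_1^{(2)}) \oplus s^{-1}(C_2^{(2)}) \oplus t_{5,1} \oplus t_{5,2}$
            if $\Delta C^{(1)} = \Delta B^{(1)} \oplus \Delta \hat{C}^{(2)}$ then    ▷ True with probability $2^{-8}$
                $A_1^{(1)} \leftarrow s(A_1^{(0)} \oplus B_1^{(0)} \oplus k_1 \oplus t_{1,1})$
                $A_2^{(1)} \leftarrow s(A_2^{(0)} \oplus B_2^{(0)} \oplus k_1 \oplus t_{1,2})$
                $\Delta A^{(1)} \leftarrow A_1^{(1)} \oplus A_2^{(1)}$
                for all $k_7$ do    ▷ $2^8$ possible $k_7$
                    $A_1^{(2)} \leftarrow \hat{A}_1^{(3)} \oplus B_1^{(2)} \oplus k_7$
                    $A_2^{(2)} \leftarrow \hat{A}_2^{(3)} \oplus B_2^{(2)} \oplus k_7$
                    $\hat{A}_1^{(2)} \leftarrow s^{-1}(A_1^{(2)}) \oplus t_{4,1}$
                    $\hat{A}_2^{(2)} \leftarrow s^{-1}(A_2^{(2)}) \oplus t_{4,2}$
                    $\Delta \hat{A}^{(2)} \leftarrow \hat{A}_1^{(2)} \oplus \hat{A}_2^{(2)}$
                    if $\Delta A^{(1)} = \Delta B^{(1)} \oplus \Delta \hat{A}^{(2)}$ then    ▷ True with probability $2^{-8}$
                        for all $k_3$ do    ▷ $2^8$ possible $k_3$
                            $B_1^{(1)} \leftarrow s(A_1^{(1)} \oplus C_1^{(1)} \oplus B_1^{(0)} \oplus k_3 \oplus t_{3,1})$
                            $B_2^{(1)} \leftarrow s(A_2^{(1)} \oplus C_2^{(1)} \oplus B_2^{(0)} \oplus k_3 \oplus t_{3,2})$
                            $k_{4,1} \leftarrow B_1^{(1)} \oplus \hat{A}_1^{(2)} \oplus A_1^{(1)}$
                            $k_{4,2} \leftarrow B_2^{(1)} \oplus \hat{A}_2^{(2)} \oplus A_2^{(1)}$
                            if $k_{4,1} = k_{4,2}$ then    ▷ True with probability $2^{-8}$
                                $k_4 \leftarrow k_{4,1}$
                                $k_5 \leftarrow s^{-1}(C_1^{(2)}) \oplus B_1^{(1)} \oplus C_1^{(1)} \oplus t_{5,1}$
                                $k_6 \leftarrow s^{-1}(B_1^{(2)}) \oplus A_1^{(2)} \oplus C_1^{(2)} \oplus B_1^{(1)} \oplus t_{6,1}$
                                Print($k_1 \,\|\, k_2 \,\|\, k_3 \,\|\, k_4 \,\|\, k_5 \,\|\, k_6 \,\|\, k_7$)
                            end if
                        end for
                    end if
                end for
            end if
        end for
    end for
end procedure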
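
4.5  Known-Plaintext  Attack  on  Four-Round  SoDark-3

Figure 4.3 shows the four-round attack. The basic principle of partial differential matching
remains the same. This time the sieving is done using $\Delta A^{(2)}$.

The main loop of the attack iterates over all possible values of $k_2$ and $k_3$. In each loop, a list
that associates the values of $k_4$ and $\Delta A^{(2)}$ with values of $k_5$, $\hat{A}^{(3)}$, $B^{(2)}$, and $\hat{C}^{(3)}$ is built from
the ciphertexts using the following calculations and iterating over all values of $k_4$ and $k_5$:

$$B_1^{(3)} = s^{-1}\left(B_1^{(4)}\right) \oplus A_1^{(4)} \oplus C_1^{(4)} \oplus k_5 \oplus t_{4,1} \tag{4.39}$$

$$B_2^{(3)} = s^{-1}\left(B_2^{(4)}\right) \oplus A_2^{(4)} \oplus C_2^{(4)} \oplus k_5 \oplus t_{4,2} \tag{4.40}$$

$$A_1^{(3)} = s^{-1}\left(A_1^{(4)}\right) \oplus B_1^{(3)} \oplus k_3 \oplus t_{2,1} \tag{4.41}$$

$$A_2^{(3)} = s^{-1}\left(A_2^{(4)}\right) \oplus B_2^{(3)} \oplus k_3 \oplus t_{2,2} \tag{4.42}$$

$$C_1^{(3)} = s^{-1}\left(C_1^{(4)}\right) \oplus B_1^{(3)} \oplus k_4 \oplus t_{3,1} \tag{4.43}$$

$$C_2^{(3)} = s^{-1}\left(C_2^{(4)}\right) \oplus B_2^{(3)} \oplus k_4 \oplus t_{3,2} \tag{4.44}$$

$$B_1^{(2)} = s^{-1}\left(B_1^{(3)}\right) \oplus A_1^{(3)} \oplus C_1^{(3)} \oplus k_2 \oplus t_{1,1} \tag{4.45}$$

$$B_2^{(2)} = s^{-1}\left(B_2^{(3)}\right) \oplus A_2^{(3)} \oplus C_2^{(3)} \oplus k_2 \oplus t_{1,2} \tag{4.46}$$

$$\hat{A}_1^{(3)} = s^{-1}\left(A_1^{(3)}\right) \oplus B_1^{(2)} \oplus t_{7,1} \tag{4.47}$$

$$\hat{A}_2^{(3)} = s^{-1}\left(A_2^{(3)}\right) \oplus B_2^{(2)} \oplus t_{7,2} \tag{4.48}$$

$$\Delta A^{(2)} = \hat{A}_1^{(3)} \oplus \hat{A}_2^{(3)} \tag{4.49}$$

$$\hat{C}_1^{(3)} = s^{-1}\left(C_1^{(3)}\right) \oplus B_1^{(2)} \oplus t_{8,1} \tag{4.50}$$

$$\hat{C}_2^{(3)} = s^{-1}\left(C_2^{(3)}\right) \oplus B_2^{(2)} \oplus t_{8,2}. \tag{4.51}$$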

Figure 4.3. Attack on four-round SoDark-3 by matching on $\Delta A^{(2)}$. The
parts of the cipher marked in blue are known or can be calculated without
guessing any part of the key.
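
With the list built, the next step is to iterate over all possible values of $k_1$ and $k_4$ and calculate
$\Delta A^{(2)}$ from the plaintexts:

$$A_1^{(1)} = s\left(A_1^{(0)} \oplus B_1^{(0)} \oplus k_1 \oplus t_{1,1}\right) \tag{4.52}$$

$$A_2^{(1)} = s\left(A_2^{(0)} \oplus B_2^{(0)} \oplus k_1 \oplus t_{1,2}\right) \tag{4.53}$$

$$C_1^{(1)} = s\left(C_1^{(0)} \oplus B_1^{(0)} \oplus k_2 \oplus t_{2,1}\right) \tag{4.54}$$

$$C_2^{(1)} = s\left(C_2^{(0)} \oplus B_2^{(0)} \oplus k_2 \oplus t_{2,2}\right) \tag{4.55}$$

$$B_1^{(1)} = s\left(A_1^{(1)} \oplus B_1^{(0)} \oplus C_1^{(1)} \oplus k_3 \oplus t_{3,1}\right) \tag{4.56}$$

$$B_2^{(1)} = s\left(A_2^{(1)} \oplus B_2^{(0)} \oplus C_2^{(1)} \oplus k_3 \oplus t_{3,2}\right) \tag{4.57}$$

$$A_1^{(2)} = s\left(A_1^{(1)} \oplus B_1^{(1)} \oplus k_4 \oplus t_{4,1}\right) \tag{4.58}$$

$$A_2^{(2)} = s\left(A_2^{(1)} \oplus B_2^{(1)} \oplus k_4 \oplus t_{4,2}\right) \tag{4.59}$$

$$\Delta A^{(2)} = A_1^{(2)} \oplus A_2^{(2)}. \tag{4.60}$$

For each value of $\Delta A^{(2)}$ calculated from the plaintext, the corresponding entries in the list
calculated from the ciphertext are retrieved. Each entry will contain the implied value of
$k_5$ and allow the calculation of $k_1$, $k_6$, and $k_7$:

$$C_1^{(2)} = s\left(B_1^{(1)} \oplus C_1^{(1)} \oplus k_5 \oplus t_{5,1}\right) \tag{4.61}$$

$$C_2^{(2)} = s\left(B_2^{(1)} \oplus C_2^{(1)} \oplus k_5 \oplus t_{5,2}\right) \tag{4.62}$$

$$k_{1,1} = C_1^{(2)} \oplus \hat{C}_1^{(3)} \tag{4.63}$$

$$k_{1,2} = C_2^{(2)} \oplus \hat{C}_2^{(3)} \tag{4.64}$$

$$k_{6,1} = B_1^{(1)} \oplus A_1^{(2)} \oplus C_1^{(2)} \oplus t_{6,1} \oplus s^{-1}\left(B_1^{(2)}\right) \tag{4.65}$$

$$k_{6,2} = B_2^{(1)} \oplus A_2^{(2)} \oplus C_2^{(2)} \oplus t_{6,2} \oplus s^{-1}\left(B_2^{(2)}\right) \tag{4.66}$$

$$k_{7,1} = A_1^{(2)} \oplus \hat{A}_1^{(3)} \tag{4.67}$$

$$k_{7,2} = A_2^{(2)} \oplus \hat{A}_2^{(3)}. \tag{4.68}$$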

Finally, good matches can be identified by checking that $k_1 = k_{1,1} = k_{1,2}$, $k_{6,1} = k_{6,2}$, and
$k_{7,1} = k_{7,2}$. Algorithm 4.3 shows the attack process. That description is again used to
calculate the time complexity of the attack, which is

$$\frac{2 \cdot 2^8 + 4 \cdot 2^{24} + 8 \cdot 2^{32} + 4 \cdot 2.6 \cdot 2^{32}}{9} \approx 2^{33}. \tag{4.69}$$

The list used in the attack requires memory equivalent to about $2^{17.6}$ blocks. The data
complexity remains 2.
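
Algorithm 4.3 Perform a known-plaintext attack on four-round SoDark-3 and print all
candidate keys.

procedure CrackFourRounds($P_1$, $C_1$, $T_1$, $P_2$, $C_2$, $T_2$)
    $\hat{B}_1^{(4)} \leftarrow s^{-1}(B_1^{(4)}) \oplus A_1^{(4)} \oplus C_1^{(4)} \oplus t_{4,1}$
    $\hat{B}_2^{(4)} \leftarrow s^{-1}(B_2^{(4)}) \oplus A_2^{(4)} \oplus C_2^{(4)} \oplus t_{4,2}$
    $\hat{A}_1^{(4)} \leftarrow s^{-1}(A_1^{(4)}) \oplus t_{2,1}$
    $\hat{A}_2^{(4)} \leftarrow s^{-1}(A_2^{(4)}) \oplus t_{2,2}$
    $\hat{C}_1^{(4)} \leftarrow s^{-1}(C_1^{(4)}) \oplus t_{3,1}$
    $\hat{C}_2^{(4)} \leftarrow s^{-1}(C_2^{(4)}) \oplus t_{3,2}$
    for all $k_2$ do    ▷ $2^8$ possible $k_2$
        $C_1^{(1)} \leftarrow s(B_1^{(0)} \oplus C_1^{(0)} \oplus k_2 \oplus t_{2,1})$
        $C_2^{(1)} \leftarrow s(B_2^{(0)} \oplus C_2^{(0)} \oplus k_2 \oplus t_{2,2})$
        for all $k_3$ do    ▷ $2^8$ possible $k_3$
            $L \leftarrow$ empty list    ▷ Indexed by $k_4$, $\Delta A^{(2)}$
            for all $k_4, k_5$ do    ▷ $2^{16}$ possible $k_4, k_5$
                $B_1^{(3)} \leftarrow \hat{B}_1^{(4)} \oplus k_5$
                $B_2^{(3)} \leftarrow \hat{B}_2^{(4)} \oplus k_5$
                $A_1^{(3)} \leftarrow \hat{A}_1^{(4)} \oplus B_1^{(3)} \oplus k_3$
                $A_2^{(3)} \leftarrow \hat{A}_2^{(4)} \oplus B_2^{(3)} \oplus k_3$
                $C_1^{(3)} \leftarrow \hat{C}_1^{(4)} \oplus B_1^{(3)} \oplus k_4$
                $C_2^{(3)} \leftarrow \hat{C}_2^{(4)} \oplus B_2^{(3)} \oplus k_4$
                $B_1^{(2)} \leftarrow s^{-1}(B_1^{(3)}) \oplus A_1^{(3)} \oplus C_1^{(3)} \oplus k_2 \oplus t_{1,1}$
                $B_2^{(2)} \leftarrow s^{-1}(B_2^{(3)}) \oplus A_2^{(3)} \oplus C_2^{(3)} \oplus k_2 \oplus t_{1,2}$
                $\hat{A}_1^{(3)} \leftarrow s^{-1}(A_1^{(3)}) \oplus B_1^{(2)} \oplus t_{7,1}$
                $\hat{A}_2^{(3)} \leftarrow s^{-1}(A_2^{(3)}) \oplus B_2^{(2)} \oplus t_{7,2}$
                $\hat{C}_1^{(3)} \leftarrow s^{-1}(C_1^{(3)}) \oplus B_1^{(2)} \oplus t_{8,1}$
                $\hat{C}_2^{(3)} \leftarrow s^{-1}(C_2^{(3)}) \oplus B_2^{(2)} \oplus t_{8,2}$
                $\Delta A^{(2)} \leftarrow \hat{A}_1^{(3)} \oplus \hat{A}_2^{(3)}$
                $L$.append($k_4$, $\Delta A^{(2)}$, $k_5$, $\hat{A}_1^{(3)}$, $\hat{A}_2^{(3)}$, $\hat{C}_1^{(3)}$, $\hat{C}_2^{(3)}$, $B_1^{(2)}$, $B_2^{(2)}$)
            end for
            for all $k_1$ do    ▷ $2^8$ possible $k_1$
                $A_1^{(1)} \leftarrow s(A_1^{(0)} \oplus B_1^{(0)} \oplus k_1 \oplus t_{1,1})$
                $A_2^{(1)} \leftarrow s(A_2^{(0)} \oplus B_2^{(0)} \oplus k_1 \oplus t_{1,2})$
                $B_1^{(1)} \leftarrow s(B_1^{(0)} \oplus A_1^{(1)} \oplus C_1^{(1)} \oplus k_3 \oplus t_{3,1})$
                $B_2^{(1)} \leftarrow s(B_2^{(0)} \oplus A_2^{(1)} \oplus C_2^{(1)} \oplus k_3 \oplus t_{3,2})$
                for all $k_4$ do    ▷ $2^8$ possible $k_4$
                    $A_1^{(2)} \leftarrow s(A_1^{(1)} \oplus B_1^{(1)} \oplus k_4 \oplus t_{4,1})$
                    $A_2^{(2)} \leftarrow s(A_2^{(1)} \oplus B_2^{(1)} \oplus k_4 \oplus t_{4,2})$
                    $\Delta A^{(2)} \leftarrow A_1^{(2)} \oplus A_2^{(2)}$
                    for all $(k_5, \hat{A}_1^{(3)}, \hat{A}_2^{(3)}, \hat{C}_1^{(3)}, \hat{C}_2^{(3)}, B_1^{(2)}, B_2^{(2)}) \in L[k_4, \Delta A^{(2)}]$ do    ▷ 2.6 iterations on average
                        $C_1^{(2)} \leftarrow s(B_1^{(1)} \oplus C_1^{(1)} \oplus k_5 \oplus t_{5,1})$
                        $C_2^{(2)} \leftarrow s(B_2^{(1)} \oplus C_2^{(1)} \oplus k_5 \oplus t_{5,2})$
                        $k_{1,1} \leftarrow C_1^{(2)} \oplus \hat{C}_1^{(3)}$
                        $k_{1,2} \leftarrow C_2^{(2)} \oplus \hat{C}_2^{(3)}$
                        $k_{6,1} \leftarrow s^{-1}(B_1^{(2)}) \oplus A_1^{(2)} \oplus C_1^{(2)} \oplus B_1^{(1)} \oplus t_{6,1}$
                        $k_{6,2} \leftarrow s^{-1}(B_2^{(2)}) \oplus A_2^{(2)} \oplus C_2^{(2)} \oplus B_2^{(1)} \oplus t_{6,2}$
                        $k_{7,1} \leftarrow A_1^{(2)} \oplus \hat{A}_1^{(3)}$
                        $k_{7,2} \leftarrow A_2^{(2)} \oplus \hat{A}_2^{(3)}$
                        if $k_1 = k_{1,1} = k_{1,2}$ and $k_{6,1} = k_{6,2}$ and $k_{7,1} = k_{7,2}$ then
                            $k_6 \leftarrow k_{6,1}$
                            $k_7 \leftarrow k_{7,1}$
                            Print($k_1 \,\|\, k_2 \,\|\, k_3 \,\|\, k_4 \,\|\, k_5 \,\|\, k_6 \,\|\, k_7$)
                        end if
                    end for
                end for
            end for
        end for
    end for
end procedure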

4.6  Known-Plaintext  Attack  on  Five-Round  SoDark-3

The attack on five-round SoDark-3 is structurally simpler than the previously described
attacks and can be described entirely by treating the cipher as an iterated Even-Mansour
construction. It is shown in Figure 4.4. Since the first two rounds use key bytes
$k_1, k_2, k_3, k_4, k_5, k_6$ and the last two rounds use key bytes $k_1, k_3, k_4, k_5, k_6, k_7$, sieving can be
done in the middle by comparing differentials, thus bypassing the third round key bytes. By
looping over the common key bytes $k_1, k_3, k_4, k_5, k_6$ in an outer loop, the memory
requirements are decreased significantly when compared to a standard MITM attack. The attack
is equivalent to the three-subset MITM attack described in [37].


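[Figure: the five rounds with round inputs $k_{123} \oplus t_{123}$, $k_{456} \oplus t_{456}$, $k_{712} \oplus t_{781}$,
$k_{345} \oplus t_{234}$, and $k_{671} \oplus t_{567}$. The forward part depends on $k_1, k_2, k_3, k_4, k_5, k_6$ and the
backward part on $k_1, k_3, k_4, k_5, k_6, k_7$.]

Figure 4.4. Attack on five-round SoDark-3.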


Using the notation from Equation 3.24, the attack works by calculating

$$v_1 = g\left(g\left(T(P_1) \oplus k_{123} \oplus t_{123,1}\right) \oplus k_{456} \oplus t_{456,1}\right) \oplus t_{781,1} \tag{4.70}$$

$$v_2 = g\left(g\left(T(P_2) \oplus k_{123} \oplus t_{123,2}\right) \oplus k_{456} \oplus t_{456,2}\right) \oplus t_{781,2} \tag{4.71}$$

$$\Delta v = v_1 \oplus v_2 \tag{4.72}$$

for all possible values of $k_1, k_2, k_3, k_4, k_5, k_6$ and storing them in a list indexed by $\Delta v$. Then
the same calculation is done for all possible values of $k_1, k_3, k_4, k_5, k_6, k_7$:

$$w_1 = g^{-1}\left(g^{-1}\left(g^{-1}(T(C_1)) \oplus k_{671} \oplus t_{567,1}\right) \oplus k_{345} \oplus t_{234,1}\right) \tag{4.73}$$

$$w_2 = g^{-1}\left(g^{-1}\left(g^{-1}(T(C_2)) \oplus k_{671} \oplus t_{567,2}\right) \oplus k_{345} \oplus t_{234,2}\right) \tag{4.74}$$

$$\Delta w = w_1 \oplus w_2. \tag{4.75}$$

The value $\Delta w$ is then used to look up the key bytes $k_1, k_2, k_3, k_4, k_5, k_6$ in the list. If the
common key bytes $k_1, k_3, k_4, k_5, k_6$ match, the candidate key can be tested against more
plaintext-ciphertext-tweak tuples.

Algorithm 4.4 implements the attack. The following time complexity is calculated from
that algorithm:

$$\frac{12 \cdot 2^{40} \left(2^8 + 2^8\right)}{12} = 2^{49}. \tag{4.76}$$

The generated list uses $2^8$ blocks of storage and the data complexity is still 2.


Algorithm 4.4 Perform a known-plaintext attack on five-round SoDark-3 and print all
candidate keys.

procedure CrackFiveRounds($P_1$, $C_1$, $T_1$, $P_2$, $C_2$, $T_2$)
    for all $k_1, k_3, k_4, k_5, k_6$ do    ▷ $2^{40}$ possible $k_1, k_3, k_4, k_5, k_6$
        $L \leftarrow$ empty list
        for all $k_2$ do    ▷ $2^8$ possible $k_2$
            $v_1 \leftarrow g(g(T(P_1) \oplus k_{123} \oplus t_{123,1}) \oplus k_{456} \oplus t_{456,1}) \oplus t_{781,1}$
            $v_2 \leftarrow g(g(T(P_2) \oplus k_{123} \oplus t_{123,2}) \oplus k_{456} \oplus t_{456,2}) \oplus t_{781,2}$
            $\Delta v \leftarrow v_1 \oplus v_2$
            $L$.append($\Delta v$, $k_2$)
        end for
        for all $k_7$ do    ▷ $2^8$ possible $k_7$
            $w_1 \leftarrow g^{-1}(g^{-1}(g^{-1}(T(C_1)) \oplus k_{671} \oplus t_{567,1}) \oplus k_{345} \oplus t_{234,1})$
            $w_2 \leftarrow g^{-1}(g^{-1}(g^{-1}(T(C_2)) \oplus k_{671} \oplus t_{567,2}) \oplus k_{345} \oplus t_{234,2})$
            $\Delta w \leftarrow w_1 \oplus w_2$
            for all $k_2 \in L[\Delta w]$ do
                Print($k_1 \,\|\, k_2 \,\|\, k_3 \,\|\, k_4 \,\|\, k_5 \,\|\, k_6 \,\|\, k_7$)
            end for
        end for
    end for
end procedure

4.7  Chosen-Tweak  Attack  on  Six-  and  Seven-Round  SoDark-3

Structural  attacks  of  the  types  described  previously,  where  the  cipher  is  split  in  parts  that 
use  different  subsets  of  the  full  seven-byte  key,  cannot  be  extended  beyond  five  rounds. 
Nonetheless,  for  certain  combinations  of  plaintext,  ciphertext,  tweak,  and  key,  it  is  possible 
to  predict  part  of  the  internal  state  of  the  cipher  from  the  ciphertext  alone. 

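For the six-round attack, consider two plaintext-ciphertext-tweak tuples where $P_1 \neq P_2$,
$C_1 = C_2$ and all bytes in the tweak are identical except for $t_{5,1} \neq t_{5,2}$. The key schedule in
Table 3.2 shows that this is possible if and only if $\Delta A^{(4)} = t_{5,1} \oplus t_{5,2}$, $\Delta B^{(4)} = 0$, and
$\Delta C^{(4)} = 0$. This known internal differential can be used to calculate $\Delta B^{(3)}$ and $\Delta C^{(3)}$ in the
following way: First,

$$\Delta B^{(3)} = \Delta A^{(4)} \oplus s^{-1}\left(B_1^{(4)}\right) \oplus s^{-1}\left(B_2^{(4)}\right) \oplus \Delta C^{(4)} \oplus \Delta t_4. \tag{4.77}$$

Since $B_1^{(4)} = B_2^{(4)}$, $\Delta C^{(4)} = 0$, and $\Delta t_4 = 0$, this reduces Equation 4.77 to

$$\Delta B^{(3)} = \Delta A^{(4)} = \Delta t_5. \tag{4.78}$$

For the same reason,

$$\begin{aligned}
\Delta C^{(3)} &= \Delta B^{(3)} \oplus s^{-1}\left(C_1^{(4)}\right) \oplus s^{-1}\left(C_2^{(4)}\right) \oplus \Delta t_3 \\
&= \Delta B^{(3)} = \Delta A^{(4)} = \Delta t_5.
\end{aligned} \tag{4.79}$$

This knowledge allows sieving of possible $k_1, k_2, k_3, k_4, k_5, k_6$ by calculating $\Delta C^{(3)}$ from the
plaintexts. The process is illustrated in Figure 4.5.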
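
The differential trail can be checked experimentally with the sketch below. Rather than searching for a colliding pair, it constructs one: two round-four states that differ only by $\Delta t_5$ in the A byte are decrypted back to plaintexts and then encrypted through all six rounds. The S-box is a stand-in random permutation, and the cyclic key and tweak schedule (seven key bytes, eight tweak bytes, order A, C, B within a round) is an assumption inferred from the equations in this chapter; the function names are hypothetical.

```python
import random

# Stand-in 8-bit S-box (NOT the real SoDark table) and its inverse.
_rng = random.Random(5)
S = list(range(256))
_rng.shuffle(S)
S_INV = [0] * 256
for _x, _y in enumerate(S):
    S_INV[_y] = _x


def schedule(r, j):
    """0-based (key, tweak) byte indices for round r and position j (A, C, B)."""
    i = 3 * r + j
    return i % 7, i % 8


def encrypt_round(state, key, tw, r):
    a, b, c = state
    (ka, ta), (kc, tc), (kb, tb) = schedule(r, 0), schedule(r, 1), schedule(r, 2)
    a = S[a ^ b ^ key[ka] ^ tw[ta]]
    c = S[c ^ b ^ key[kc] ^ tw[tc]]
    b = S[b ^ a ^ c ^ key[kb] ^ tw[tb]]
    return a, b, c


def decrypt_round(state, key, tw, r):
    a, b, c = state
    (ka, ta), (kc, tc), (kb, tb) = schedule(r, 0), schedule(r, 1), schedule(r, 2)
    b_prev = S_INV[b] ^ a ^ c ^ key[kb] ^ tw[tb]
    a_prev = S_INV[a] ^ b_prev ^ key[ka] ^ tw[ta]
    c_prev = S_INV[c] ^ b_prev ^ key[kc] ^ tw[tc]
    return a_prev, b_prev, c_prev


def build_pair(key, tw1, tw2, state4):
    """Plaintexts whose round-four states differ only by dt5 in the A byte."""
    dt5 = tw1[4] ^ tw2[4]
    s1, s2 = state4, (state4[0] ^ dt5, state4[1], state4[2])
    for r in (3, 2, 1, 0):
        s1 = decrypt_round(s1, key, tw1, r)
        s2 = decrypt_round(s2, key, tw2, r)
    return s1, s2


def states(pt, key, tw, rounds):
    """All intermediate states, starting with the plaintext."""
    out = [pt]
    for r in range(rounds):
        out.append(encrypt_round(out[-1], key, tw, r))
    return out


rng = random.Random(9)
key = [rng.randrange(256) for _ in range(7)]
tw1 = [rng.randrange(256) for _ in range(8)]
tw2 = list(tw1)
tw2[4] ^= 0x5A                      # only t5 differs between the tweaks
p1, p2 = build_pair(key, tw1, tw2, (0x11, 0x22, 0x33))
st1, st2 = states(p1, key, tw1, 6), states(p2, key, tw2, 6)
print(st1[6] == st2[6], hex(st1[3][2] ^ st2[3][2]))
```

With the pair built this way, the ciphertexts collide and both the B and C bytes after round three differ by exactly $\Delta t_5$, matching Equations 4.78 and 4.79. In the actual attack the pair is not constructed but found among collected tuples, as described next.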

Unlike the previous attacks, which work on arbitrary message tuples, the attack on six
rounds requires a specific output differential. The first step of the attack is therefore to find a
pair of plaintext-ciphertext-tweak tuples that satisfies it. Assuming that the cipher's
randomization properties after four rounds are good (this is investigated in [5]), all differentials
after the fourth and subsequent rounds have probability $2^{-24}$. The number of pairs of
plaintext-ciphertext-tweak tuples $n$ required for one of them to have the required output
differential with 50% probability is therefore given by

$$\left(1 - 2^{-24}\right)^n = \frac{1}{2} \quad\Longrightarrow\quad n = \frac{\log \frac{1}{2}}{\log\left(2^{24} - 1\right) - \log\left(2^{24}\right)} \approx 11{,}629{,}080. \tag{4.80}$$

Figure 4.5. The first four rounds in the attacks on six- and seven-round
SoDark-3.

Unlike in a normal birthday attack, the required pairs of plaintext-ciphertext-tweak tuples
must be formed so that each tuple in the pair has a different tweak. The most efficient
way to achieve this in an oracle model is to generate plaintext-ciphertext-tweak tuples for
two different tweaks with $t_{5,1} \neq t_{5,2}$ and all other tweak bytes identical. This way, with $n$
generated tuples per tweak, $n^2$ tuple-pairs can be formed. Thus, only
$\sqrt{11{,}629{,}080} \approx 3410 \approx 2^{11.7}$ tuples are required for each tweak in order to find the required
output differential with 50% probability. This is, in effect, a version of the birthday paradox
with two subsets.
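
The numbers in this paragraph can be reproduced with a short calculation. The sketch below is only a check of the arithmetic; the function name is hypothetical, and the per-pair probability $2^{-24}$ is the assumption stated in the text.

```python
import math


def pairs_needed(p_single: float, p_success: float = 0.5) -> float:
    """Number of independent pairs needed so that at least one hits with p_success."""
    return math.log(1.0 - p_success) / math.log1p(-p_single)


n_pairs = pairs_needed(2.0**-24)
tuples_per_tweak = math.sqrt(n_pairs)   # n tuples per tweak give n**2 pairs
print(round(n_pairs), tuples_per_tweak, math.log2(tuples_per_tweak))
```

Splitting the data evenly between the two tweaks is what turns the requirement of roughly 11.6 million pairs into only a few thousand encryptions per tweak.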


Algorithm 4.5 performs the six-round attack. Since the filtering step can be done without
any S-box operations, its time complexity can be neglected. The only source of complexity
that remains is the calculation of $\Delta C^{(3)}$, which is

$$\frac{2 \cdot 2^8 + 2 \cdot 2^{16} + 2 \cdot 2^{24} + 2 \cdot 2^{32} + 2 \cdot 2^{40} + 4 \cdot 2^{48}}{15} \approx 2^{46.1}. \tag{4.81}$$


No  memory  in  addition  to  registers  is  needed  for  the  attack.  Figure  4.6  shows  the  trade-off 
curve  for  the  relationship  between  the  number  of  available  tuples  and  probability  of  success. 


The attack is extended to seven rounds by the addition of an initial filtering step to find a
pair with the correct fourth-round differential. For each generated pair of tuples, calculate

$$\Delta A^{(7)} = A_1^{(7)} \oplus A_2^{(7)} \tag{4.82}$$

$$\Delta C^{(7)} = C_1^{(7)} \oplus C_2^{(7)}. \tag{4.83}$$

If $\Delta A^{(7)} = 0$ and $\Delta C^{(7)} = 0$, continue by calculating

$$\hat{B}_1^{(7)} = s^{-1}\left(B_1^{(7)}\right) \oplus A_1^{(7)} \oplus C_1^{(7)} \oplus t_{5,1} \tag{4.84}$$

$$\hat{B}_2^{(7)} = s^{-1}\left(B_2^{(7)}\right) \oplus A_2^{(7)} \oplus C_2^{(7)} \oplus t_{5,2} \tag{4.85}$$

$$\Delta B^{(6)} = \hat{B}_1^{(7)} \oplus \hat{B}_2^{(7)}. \tag{4.86}$$

Figure 4.6. Trade-off curve between data and probability of success in the
case with two sets of tweaks of same size.

If $\Delta B^{(6)} = 0$, the pair has the required fourth-round differential and the cipher can be
attacked in the same way as the six-round version. Note that this filtering step does not
involve guessing any key bits.

The addition of a filtering step increases the time complexity for an attack with 50%
probability of success by a negligible amount:

$$\frac{2 \cdot 11{,}629{,}080 + 2 \cdot 2^8 + 2 \cdot 2^{16} + 2 \cdot 2^{24} + 2 \cdot 2^{32} + 2 \cdot 2^{40} + 4 \cdot 2^{48}}{18} \approx 2^{45.8}. \tag{4.87}$$
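Algorithm 4.5 Perform a chosen-tweak attack on six-round SoDark-3 and print all candidate keys.

Require: $\mathcal{C}_1 = \mathcal{C}_2$, $t_{1,1} = t_{1,2}$, $t_{2,1} = t_{2,2}$, $t_{3,1} = t_{3,2}$, $t_{4,1} = t_{4,2}$, $t_{5,1} \neq t_{5,2}$, $t_{6,1} = t_{6,2}$, $t_{7,1} = t_{7,2}$, $t_{8,1} = t_{8,2}$

procedure CrackSixRounds($\mathcal{P}_1$, $\mathcal{C}_1$, $\mathcal{T}_1$, $\mathcal{P}_2$, $\mathcal{C}_2$, $\mathcal{T}_2$)
    $\Delta t_5 \leftarrow t_{5,1} \oplus t_{5,2}$
    for all $k_1$ do
        $A_1^{(1)} \leftarrow s(A_1^{(0)} \oplus B_1^{(0)} \oplus k_1 \oplus t_{1,1})$
        $A_2^{(1)} \leftarrow s(A_2^{(0)} \oplus B_2^{(0)} \oplus k_1 \oplus t_{1,2})$
        for all $k_2$ do
            $C_1^{(1)} \leftarrow s(C_1^{(0)} \oplus B_1^{(0)} \oplus k_2 \oplus t_{2,1})$
            $C_2^{(1)} \leftarrow s(C_2^{(0)} \oplus B_2^{(0)} \oplus k_2 \oplus t_{2,2})$
            for all $k_3$ do
                $B_1^{(1)} \leftarrow s(A_1^{(1)} \oplus B_1^{(0)} \oplus C_1^{(1)} \oplus k_3 \oplus t_{3,1})$
                $B_2^{(1)} \leftarrow s(A_2^{(1)} \oplus B_2^{(0)} \oplus C_2^{(1)} \oplus k_3 \oplus t_{3,2})$
                for all $k_4$ do
                    $A_1^{(2)} \leftarrow s(A_1^{(1)} \oplus B_1^{(1)} \oplus k_4 \oplus t_{4,1})$
                    $A_2^{(2)} \leftarrow s(A_2^{(1)} \oplus B_2^{(1)} \oplus k_4 \oplus t_{4,2})$
                    for all $k_5$ do
                        $C_1^{(2)} \leftarrow s(C_1^{(1)} \oplus B_1^{(1)} \oplus k_5 \oplus t_{5,1})$
                        $C_2^{(2)} \leftarrow s(C_2^{(1)} \oplus B_2^{(1)} \oplus k_5 \oplus t_{5,2})$
                        for all $k_6$ do
                            $B_1^{(2)} \leftarrow s(A_1^{(2)} \oplus B_1^{(1)} \oplus C_1^{(2)} \oplus k_6 \oplus t_{6,1})$
                            $B_2^{(2)} \leftarrow s(A_2^{(2)} \oplus B_2^{(1)} \oplus C_2^{(2)} \oplus k_6 \oplus t_{6,2})$
                            $C_1^{(3)} \leftarrow s(C_1^{(2)} \oplus B_1^{(2)} \oplus k_1 \oplus t_{8,1})$
                            $C_2^{(3)} \leftarrow s(C_2^{(2)} \oplus B_2^{(2)} \oplus k_1 \oplus t_{8,2})$
                            $\Delta C^{(3)} \leftarrow C_1^{(3)} \oplus C_2^{(3)}$
                            if $\Delta C^{(3)} = \Delta t_5$ then
                                Print($k_1 \parallel k_2 \parallel k_3 \parallel k_4 \parallel k_5 \parallel k_6$)
                            end if
                        end for
                    end for
                end for
            end for
        end for
    end for
end procedure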
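
To make the loop structure of Algorithm 4.5 concrete, the following Python sketch runs the same nested search against a scaled-down toy cipher. Everything beyond the algorithm itself is an assumption made for illustration: the 3-bit "bytes" (`W`), the randomly shuffled placeholder S-box `SBOX` (which is not the SoDark S-box), the helper names `encrypt`, `find_pair`, and `crack_six_rounds`, and the cyclic use of seven key bytes and eight tweak bytes, which is inferred from the round equations in the algorithm.

```python
import random

W = 3                                  # toy "byte" width; SoDark uses 8 bits
N = 1 << W
rng = random.Random(1)
SBOX = list(range(N))
rng.shuffle(SBOX)                      # placeholder S-box, NOT the SoDark S-box


def encrypt(p, key, tweak, rounds=6):
    """Toy three-byte cipher using 7 key and 8 tweak bytes cyclically."""
    a, b, c = p
    for r in range(rounds):
        k = [key[(3 * r + j) % 7] for j in range(3)]
        t = [tweak[(3 * r + j) % 8] for j in range(3)]
        a = SBOX[a ^ b ^ k[0] ^ t[0]]
        c = SBOX[c ^ b ^ k[1] ^ t[1]]
        b = SBOX[a ^ b ^ c ^ k[2] ^ t[2]]
    return (a, b, c)


def find_pair(key):
    """Find two tuples whose tweaks differ only in t5 and whose ciphertexts collide."""
    while True:
        t1 = [rng.randrange(N) for _ in range(8)]
        t2 = list(t1)
        t2[4] ^= rng.randrange(1, N)   # nonzero difference in the fifth tweak byte
        p1 = tuple(rng.randrange(N) for _ in range(3))
        p2 = tuple(rng.randrange(N) for _ in range(3))
        if encrypt(p1, key, t1) == encrypt(p2, key, t2):
            return (p1, t1), (p2, t2)


def crack_six_rounds(tuple1, tuple2):
    """Return every (k1, ..., k6) for which Delta C^(3) equals Delta t5."""
    (p1, t1), (p2, t2) = tuple1, tuple2
    P, T = (p1, p2), (t1, t2)          # plaintexts are (A, B, C) triples
    dt5 = t1[4] ^ t2[4]
    found = []
    for k1 in range(N):
        A1 = [SBOX[P[i][0] ^ P[i][1] ^ k1 ^ T[i][0]] for i in (0, 1)]
        for k2 in range(N):
            C1 = [SBOX[P[i][2] ^ P[i][1] ^ k2 ^ T[i][1]] for i in (0, 1)]
            for k3 in range(N):
                B1 = [SBOX[A1[i] ^ P[i][1] ^ C1[i] ^ k3 ^ T[i][2]] for i in (0, 1)]
                for k4 in range(N):
                    A2 = [SBOX[A1[i] ^ B1[i] ^ k4 ^ T[i][3]] for i in (0, 1)]
                    for k5 in range(N):
                        C2 = [SBOX[C1[i] ^ B1[i] ^ k5 ^ T[i][4]] for i in (0, 1)]
                        for k6 in range(N):
                            B2 = [SBOX[A2[i] ^ B1[i] ^ C2[i] ^ k6 ^ T[i][5]] for i in (0, 1)]
                            C3 = [SBOX[C2[i] ^ B2[i] ^ k1 ^ T[i][7]] for i in (0, 1)]
                            if C3[0] ^ C3[1] == dt5:
                                found.append((k1, k2, k3, k4, k5, k6))
    return found


key = [rng.randrange(N) for _ in range(7)]
candidates = crack_six_rounds(*find_pair(key))
print(len(candidates), tuple(key[:6]) in candidates)
```

Because each intermediate value is computed in the outermost loop in which all of its inputs are known, the work per key byte grows with the loop depth, which is the structure reflected in the sum of powers of two in the time complexity. In the toy setting the first six bytes of the true key always appear among the candidates.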
4.8 Chosen-Tweak Attack on Eight-Round SoDark-3

The attack on six and seven rounds in the previous section can be extended to an attack
on eight rounds, i.e., the full Lattice algorithm from the 2G ALE standard. Unlike the
previous attack, the required output differential cannot be identified with certainty. It can,
however, be identified with high probability. Figure 4.7 shows the last two rounds of the
eight-round SoDark-3. The sought differential after the fourth round exists if and only if
$\Delta A^{(7)} = \Delta C^{(7)} = 0$ and $\Delta B^{(7)} = \Delta t_5$. In that case,

$$\Delta A^{(8)} = \Delta A^{(7)} \oplus \Delta B^{(7)} = \Delta B^{(7)} \tag{4.88}$$

$$\Delta C^{(8)} = \Delta C^{(7)} \oplus \Delta B^{(7)} = \Delta B^{(7)} \tag{4.89}$$

and therefore

$$\Delta A^{(8)} = \Delta C^{(8)} = \Delta B^{(7)}. \tag{4.90}$$

This differential just before the eighth round S-boxes therefore indicates a high probability
that the seventh round differential is the required one. An attack that has 50% probability
of success requires 11,629,080 plaintext-ciphertext-tweak tuple pairs, see Equation 4.80.
The average number of candidate pairs remaining after the filtering step in that case is
$2^{-16} \cdot 11{,}629{,}080 \approx 177.4 \approx 2^{7.5}$.

For plaintext-ciphertext-tweak tuples that satisfy Equation 4.90, the assumption is made
that they have the correct fourth-round differential, and the values of $k_3$ that cause

$$\Delta B^{(7)} = s^{-1}\left(s^{-1}\left(B_1^{(8)}\right) \oplus A_1^{(8)} \oplus C_1^{(8)} \oplus k_3 \oplus t_{8,1}\right) \oplus s^{-1}\left(s^{-1}\left(B_2^{(8)}\right) \oplus A_2^{(8)} \oplus C_2^{(8)} \oplus k_3 \oplus t_{8,2}\right) = t_{5,1} \oplus t_{5,2} \tag{4.91}$$

can be calculated. Candidate pairs remaining after the first filtering step will satisfy this
relationship with a probability of approximately $100/256$. In the 50% probability of success case, this will result in
$177.4 \cdot \frac{100}{256} \approx 69.3 \approx 2^{6.1}$ remaining candidate pairs.

For each remaining pair, the values of $k_1, k_2, k_3, k_4, k_5, k_6$ that give $\Delta C^{(3)} = \Delta t_5$ are searched
for using the same method as in the previous six- and seven-round attack, with the exception
that only the values of $k_3$ that satisfy Equation 4.91 for that pair are tried. We expect each
candidate pair that survived filtering step two to have 2.6 candidate values for $k_3$ on average.

Figure 4.7. The last two rounds in the attack on eight-round SoDark-3.

We  can  now  calculate  the  total  time  complexity  for  the  eight-round  attack: 

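$$\frac{1}{21} \cdot \left(6 \cdot 11{,}629{,}080 + 4 \cdot 2^{7.5} \cdot 2^{8} + 2^{6.1} \cdot \left(2 \cdot 2^{8} + 2 \cdot 2^{16} + \left(2 \cdot 2^{24} + 2 \cdot 2^{32} + 2 \cdot 2^{40} + 4 \cdot 2^{48}\right) \cdot \frac{2.6}{2^{8}}\right)\right) \approx 2^{45.1}. \tag{4.92}$$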
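
The figures quoted in this section can be checked numerically. The short Python sketch below recomputes them under two stated assumptions: that a pair passing the first filter satisfies Equation 4.91 with probability 100/256 (the value implied by the ratio 69.3/177.4), and that restricting $k_3$ to about 2.6 candidates scales the terms of the six-round search from the $k_3$ loop onward by $2.6/2^8$. The variable names are illustrative only.

```python
from math import log2

pairs = 11_629_080                            # pairs needed for 50% success (Equation 4.80)
after_filter_1 = pairs * 2**-16               # pairs surviving the first filtering step
after_filter_2 = after_filter_1 * 100 / 256   # assumed probability of 100/256

# Work per remaining pair: the k1 and k2 loops are unchanged, while the loops
# from k3 onward run for about 2.6 values of k3 instead of 2**8 (assumption).
inner = 2 * 2**24 + 2 * 2**32 + 2 * 2**40 + 4 * 2**48
per_pair = 2 * 2**8 + 2 * 2**16 + inner * 2.6 / 2**8

total = (6 * pairs + 4 * 2**7.5 * 2**8 + 2**6.1 * per_pair) / 21
print(round(after_filter_1, 1), round(after_filter_2, 1), round(log2(total), 1))
```

Under these assumptions the script prints `177.4 69.3 45.1`, in agreement with the candidate-pair counts and the total time complexity stated in the text.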
The complexity of this attack is lower than the attacks on six and seven rounds presented
in  the  previous  section.  This  is  because  the  differential  after  the  next  to  last  round — which 
is  known  with  high  probability — is  used  to  deduce  information  about  part  of  the  key.  Like 
in  the  six-  and  seven-round  attacks,  21L7  plaintext-ciphertext-tweak  tuples  are  required  to 
recover  the  key  with  50%  probability.  The  memory  requirements  also  remain  the  same.  No 
memory  in  addition  to  registers  is  required. 


4.9  Experimental  Verification 

All  attacks  described  in  this  chapter  have  been  implemented  in  the  C  programming  language 
and  verified  in  practice.  The  implementations  are  publicly  available  [38]. 
CHAPTER 5:

Logic Circuit Representations of the SoDark S-box


5.1  Introduction 

For  the  attacks  in  the  following  chapters,  an  efficient  logic  circuit  representation  of  the  S-box 
is  needed.  Such  a  representation  describes  the  relationship  between  the  inputs  and  outputs 
of  the  S-box  as  a  circuit  of  logic  gates.  A  logic  circuit  implementation  of  an  S-box  considers 
each  of  the  S-box  output  bits  as  a  separate  Boolean  function  of  the  same  input  variables. 
In  the  case  of  the  SoDark  S-box  with  eight  inputs  and  eight  outputs,  this  means  eight 
Boolean  functions  of  eight  input  variables.  This  is  in  contrast  to  representing  the  S-box  as, 
for  example,  the  algebraic  normal  form  (ANF)  of  the  Boolean  functions  it  implements,  or 
as  a  lookup  table. 

Since finding the optimum logic circuit for a given S-box is an NP-complete problem and
is intractable even for very small S-boxes, heuristic methods must be used in all but very
special cases. Although these heuristic methods are significantly faster than a brute force
search, they still require substantial computation time, even on modern
computers. In particular, for the logic circuit representations presented later that use 3-bit
lookup tables (LUTs), use of the NPS Hamming high-performance computing cluster was
necessary.

In [19], Biham presents an algorithm for generating logic circuits for the DES S-boxes. It
breaks down the truth table of each Boolean function into 16 functions of two variables and
then uses the remaining four "free" variables to choose between those 16 functions. Using
this algorithm, Biham generates logic circuit representations of the DES S-boxes that require
100 gates on average.

It  is  important  to  note  that,  although  a  logic  circuit  with  fewer  gates  is  often  faster,  this  is 
not  always  the  case.  Which  circuit  is  faster  in  practice  depends  on  the  technology  on  which 
it is implemented. In the case of hardware implementations in ASICs and FPGAs, latency
is normally of great concern. That means that the signal paths to different inputs of the same
gate must have approximately the same delay. Too large a delay will necessitate a lower
clock frequency and thus a lower speed.

Software implementations are typically limited by the available number of processor
registers. This limits the number of gate outputs that are active in parallel. If the number of
active outputs is higher than the number of available registers, memory must be used for
storing the surplus output variables. This comes with a significant performance cost. Logic
circuit representations used for generating algebraic systems as input to SAT solvers have
similar problems. In that case, some Boolean gates result in CNF representations that are
much easier for the SAT solvers to handle than others.


5.2 Kwan’s Algorithm

In [39], Kwan presents an improvement to Biham’s method from [19]. It works by
successively adding new gates to a circuit through recursive search while trying all possible
orderings of input and output bits. In this case, with eight input and eight output bits, it
requires testing $8! \cdot 8!$ combinations.

The  recursive  algorithm  described  in  [39]  takes  an  existing  partial  gate  circuit  as  input 
together  with  a  target  truth  table,  a  “don’t  care”  mask,  and  a  list  of  input  bits  already  used. 
It  returns  a  gate  in  the  circuit  whose  truth  table  is  identical  to  the  target,  except  for  the  bit 
positions  where  the  don’t  care  mask  is  zero.  Initially,  the  gate  circuit  will  only  consist  of 
the  eight  input  bits. 

Each  invocation  of  the  algorithm  can  be  split  up  into  five  successively  more  complex 
steps  [39]: 

1. Check if there already is a gate in the logic circuit with the required output truth table.
If  so,  return  that  gate. 

2.  Check  if  there  is  a  gate  with  a  truth  table  that  is  the  logic  NOT  of  the  required  output 
truth  table.  If  so,  add  a  NOT  gate  to  the  logic  circuit  and  return  it. 

3.  Try  all  combinations  of  two  gates  using  AND,  OR,  XOR,  NOT,  and  ANDNOT  gates 
and  check  if  the  resulting  output  is  equal  to  the  target  truth  table.  If  so,  add  the  gates 
and  return  the  output  gate. 
4. Try all combinations of three gates using AND, OR, XOR, NOT, and ANDNOT
gates.  If  one  of  the  combinations  results  in  the  required  output  table,  add  the  gates  to 
the  circuit  and  return  the  output  gate. 

5.  Split  the  truth  table  on  one  of  the  unused  input  bits  by  setting  the  corresponding  bits 
in  the  don’t  care  mask  to  zero.  Then  call  the  algorithm  twice  recursively:  once  with  a 
don’t  care  mask  corresponding  to  the  input  bit  equal  to  one  and  once  with  a  don’t  care 
mask  corresponding  to  the  input  bit  equal  to  zero.  Combine  the  output  from  the  two 
calls  with  a  two  gate  multiplexer.  Perform  this  once  for  each  of  the  remaining  unused 
input  bits  and  with  two  different  multiplexers.  Return  the  combination  of  input  bit 
and  multiplexer  that  results  in  the  logic  circuit  with  the  fewest  gates. 

The  implementation  details  of  Kwan’s  algorithm  are  somewhat  complex  and  the  reader  is 
referred  to  [39]  for  a  complete  description. 
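
The first three steps of the list above can be sketched in a few lines of Python. This is a simplified illustration and not the implementation from [39]: truth tables are stored as integers for a toy three-input function, the function returns a textual description instead of adding gates to a circuit, step three tries only a single two-input gate per pair, and the ANDNOT gate is assumed to compute NOT x AND y. The names `matches`, `find_gate`, and `OPS` are hypothetical.

```python
from itertools import permutations

FULL = 0xFF   # truth tables of a toy 3-input function: 8 entries per table

OPS = {
    "AND": lambda x, y: x & y,
    "OR": lambda x, y: x | y,
    "XOR": lambda x, y: x ^ y,
    "ANDNOT": lambda x, y: ~x & y & FULL,   # assumed convention: (NOT x) AND y
}


def matches(table, target, mask):
    """True if table equals target wherever the don't care mask is one."""
    return (table ^ target) & mask == 0


def find_gate(gates, target, mask=FULL):
    """Steps one to three: reuse a gate, negate a gate, or combine two gates."""
    for name, table in gates.items():                          # step 1
        if matches(table, target, mask):
            return name
    for name, table in gates.items():                          # step 2
        if matches(~table & FULL, target, mask):
            return f"NOT({name})"
    for (n1, t1), (n2, t2) in permutations(gates.items(), 2):  # step 3
        for op, fn in OPS.items():
            if matches(fn(t1, t2), target, mask):
                return f"{op}({n1}, {n2})"
    return None                                  # steps 4 and 5 would follow here


inputs = {"a": 0xF0, "b": 0xCC, "c": 0xAA}
print(find_gate(inputs, 0xF0 ^ 0xCC))       # found in step 3
print(find_gate(inputs, 0x55))              # found in step 2
print(find_gate(inputs, 0xFF, mask=0xF0))   # found in step 1 thanks to the mask
```

The last call shows the role of the don't care mask: the target differs from input `a` only in positions where the mask is zero, so the existing gate is accepted without adding anything to the circuit.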

The complexity of the algorithm increases for each of the five steps. The first and second
steps have complexity $O(n)$, where $n$ is the number of gates in the partial circuit. In
step three, this increases to $O(n^2)$, since all possible combinations of two gates must be
considered. For the same reason, the complexity of step four is $O(n^3)$. The most significant
complexity is in step five: Due to the recursion in combination with the testing of all possible
input bits, this results in a complexity of $O(b!)$, where $b$ is the number of unused input bits.
Even though the value of $b!$ is manageable in the case of the SoDark S-box, where $b \le 8$,
the big O notation hides the high complexity of each individual recursive call, which can
include a complexity of $O(n^3)$ in addition to the $O(b!)$ term.

Finding the most efficient logic circuit for all eight output functions requires testing all $8!$
orders of building those eight output functions. The result is a total complexity of Kwan’s
algorithm of $O(b! \cdot b!)$, where $b$ is the number of S-box input and output bits.


5.3  Improvements  to  Kwan’s  Algorithm 

An anonymous software project for building three-bit LUT circuit representations of
S-boxes is available as a GitHub repository [40]. It contains several improvements to Kwan’s
algorithm.

Apart from the generation of LUT-based logic circuits, the two major improvements to
Kwan’s algorithm introduced in [40] are circuit randomization and a fast feasibility checking
algorithm.

The  algorithm  described  by  Kwan  is  deterministic  and  will  always  produce  the  same  output 
given  the  same  input.  Due  to  the  heuristic  nature  of  the  algorithm,  there  is  no  guarantee 
that  this  is  the  optimal  result.  By  introducing  randomization  of  the  search  order  when 
searching  for  combinations  of  gates  in  steps  one  through  four  in  the  previous  section,  we 
can  find  equivalent — and  possibly  better — gate  circuits  simply  by  running  the  algorithm 
several  times. 

The  fast  feasibility  checking  algorithm  described  in  Algorithm  5.1  significantly  improves 
the  speed  of  Kwan’s  algorithm  by  short-circuiting  parts  of  step  four  in  the  previous  section. 
It  does  this  by  performing  a  constant-time  feasibility  check  for  each  combination  of  three 
gates  before  testing  a  large  number  of  possible  ways  to  combine  them.  The  feasibility  check 
itself  is  due  to  an  interesting  observation:  Three  gates  with  arbitrary  truth  tables  can  be 
combined  to  form  an  arbitrary  target  truth  table  if  and  only  if  the  target  truth  table  can  be 
expressed as a product-of-sums expansion of the three input truth tables [41]. The feasibility
checking  algorithm  can  be  extended  and  applied  to  an  arbitrary  number  of  input  gates  in  a 
straightforward  way. 

When  generating  LUT-based  circuits,  additional  steps  are  added  between  steps  four  and  five 
in  Kwan’s  algorithm.  These  steps  search  for  combinations  of  three,  five,  and  seven  gates 
together  with  one,  two,  and  three  LUTs,  respectively,  to  create  the  desired  output  truth  table. 
Considering  the  large  complexities  involved  in  searching  through  all  possible  combinations 
of  five  and  seven  gates  in  the  partial  circuit,  this  would  not  be  possible  without  the  speed 
increase  provided  by  the  fast  feasibility  checking  algorithm,  especially  considering  that 
there  are  256  possible  functions  for  each  LUT. 
Algorithm 5.1 Check if a target truth table can be produced by combining three input truth
tables using any combination of gates. Adapted from [40].

procedure Check3Possible(target, mask, table1, table2, table3)
    match ← 0
    t1 ← NOT table1
    i ← 0
    for i < 2 do
        t2 ← NOT table2
        k ← 0
        for k < 2 do
            t3 ← NOT table3
            m ← 0
            for m < 2 do
                r ← t1 AND t2 AND t3
                if (target AND r AND mask) = (r AND mask) then
                    match ← match OR r
                else if (target AND r AND mask) ≠ 0 then
                    return false
                end if
                t3 ← NOT t3
                m ← m + 1
            end for
            t2 ← NOT t2
            k ← k + 1
        end for
        t1 ← NOT t1
        i ← i + 1
    end for
    if (target AND mask) = (match AND mask) then
        return true
    else
        return false
    end if
end procedure
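
Algorithm 5.1 translates almost line for line into Python when truth tables are stored as integers. The sketch below is such a transcription, not the code from [40]; the `width` parameter and the four-variable example tables are assumptions added so that the function can be tried on small inputs (the SoDark S-box would use `width=256`).

```python
def check3_possible(target, mask, table1, table2, table3, width=256):
    """Transcription of Algorithm 5.1 for truth tables stored as integers."""
    full = (1 << width) - 1
    match = 0
    for t1 in (~table1 & full, table1):
        for t2 in (~table2 & full, table2):
            for t3 in (~table3 & full, table3):
                # Positions where the three inputs take one fixed combination.
                r = t1 & t2 & t3
                if (target & r & mask) == (r & mask):
                    match |= r           # target is one on every cared-for position
                elif (target & r & mask) != 0:
                    return False         # target is mixed: no gate network can help
    return (target & mask) == (match & mask)


# Truth tables of four input variables over 16 positions (toy example).
a, b, c, d = 0xFF00, 0xF0F0, 0xCCCC, 0xAAAA
print(check3_possible(a ^ b ^ c, 0xFFFF, a, b, c, width=16))   # depends on a, b, c only
print(check3_possible(d, 0xFFFF, a, b, c, width=16))           # needs the fourth input
print(check3_possible(d, d, a, b, c, width=16))                # mask keeps positions with d = 1
```

The check runs through the eight combinations of input values once, regardless of how many gate networks could be built from the three inputs, which is why it is cheap enough to run before every combination search.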




5.4  Software  Implementation 

For this research, Kwan’s algorithm from [39], along with the optimizations and
modifications from [40], was implemented in the C programming language [42]. The resulting
program can find logic circuit representations of the SoDark S-box that are suitable for
various types of implementations on different platforms. This includes representations that
use only the standard AND, OR, NOT, and XOR gates as well as an option that also allows
for ANDNOT gates. Circuits can be built for a single output bit each, or for any combination
of output bits.

In addition to using the number of gates as a metric when building the circuits, a metric that
promotes circuits with efficient CNF representations is also available. The latter is intended
for generating S-box circuit representations that have high performance when used with
SAT solvers. It uses the number of three-variable clauses in the CNF representation of
the logic circuit as a measure of the circuit’s SAT performance.

Circuits of 3-bit LUTs can also be generated. This allows fast bitslicing implementations on
Nvidia platforms that implement the lop3.b32 Parallel Thread Execution (PTX)
instruction, as described in Chapter 7. The logic circuits generated by the program can be output
as C or CUDA source code as well as in the Graphviz [43] DOT format for visualization.

5.5  Generated  Circuits 

The  program  described  in  the  previous  section  was  used  to  generate  circuits  for  the  S-box 
that  are  suitable  for  implementations  on  general  purpose  computers,  CUDA  GPUs,  and  for 
conversion  to  CNF  for  use  with  SAT  solvers.  Despite  the  optimizations  made,  and  the  use 
of  1024  processor  cores  on  the  Hamming  high-performance  computer  cluster,  creating  a 
combined  logic  circuit  for  all  eight  Boolean  functions  using  LUTs  proved  to  be  too  large  a 
problem.  Instead,  eight  separate  circuits  were  created.  Figures  5.1  and  5.2  show  examples 
of  the  generated  circuits. 
Figure 5.1. Logic circuit representation, with 60 gates, of the Boolean function
for output bit 6 of the SoDark S-box.


CHAPTER  6: 
SAT-Based  Attacks 


The SAT problem is a fundamental problem of computer science. The description of the
problem is simple: Given a Boolean formula, is there an assignment to its variables for
which the formula evaluates to true? If such an assignment exists, the formula is said to be
satisfiable. SAT problems are normally stated in CNF. If the problem can be stated in
a form where none of the clauses in its CNF expression contains more than two variables,
it is said to be a 2-CNF SAT problem. Solutions for 2-CNF SAT problems can be found in
polynomial time.

The  definition  of  the  3-CNF  SAT  problem  is  analogous  to  the  2-CNF  SAT  definition  and 
it  has  been  proven  that  all  SAT  problems  of  higher  order  are  reducible  to  an  equivalent 
3-CNF  SAT  problem.  Furthermore,  the  3-CNF  SAT  problem  is  proven  NP-complete  and  is 
among the most studied NP problems [44]. The worst-case performance of the 3-CNF SAT problem
is the same as for other MQ problems: $O(2^{\alpha n})$, $0 < \alpha < 1$. With modern SAT solvers,
$\alpha > 0.386$ for satisfiable 3-CNF SAT problems. Problems encountered in practice can
often be solved even faster than this [15].

SAT  solvers  are  computer  programs  specifically  developed  for  solving  SAT  problems. 
Modern  SAT  solvers  can  solve  hard  problems  involving  thousands  of  variables  occurring 
in  a  wide  range  of  applications.  In  contrast,  naive  brute  force  methods  can  handle  only  a 
few  tens  of  variables.  The  construction  of  SAT  solving  algorithms  is  still  an  active  research 
problem  in  academia  and  many  different  heuristics  are  used.  For  that  reason,  this  research’s 
focus  has  been  on  creating  efficient  CNF  representations  while  treating  the  SAT  solvers  as 
black  boxes. 

The  problem  of  recovering  the  key  from  a  cipher  can  be  converted  into  a  SAT  problem 
by  expressing  the  entire  cipher  in  CNF.  The  logic  circuit  representations  of  the  SoDark 
S-box  created  in  Chapter  5  can  be  converted  into  CNF  by  using  the  Tseytin  transform  [45] 
whereby  the  gates  in  the  circuit  are  converted  to  equivalent  CNF  representations.  Table  6.1 
shows  CNF  representations  of  the  gates  used  in  Chapter  5. 
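Table 6.1. Tseytin transformations for some logic gates. Adapted from [45].

Logic gate   Operation                      Conjunctive normal form
NOT          $C = \overline{A}$             $(\overline{A} \lor \overline{C}) \land (A \lor C)$
AND          $C = A \cdot B$                $(\overline{A} \lor \overline{B} \lor C) \land (A \lor \overline{C}) \land (B \lor \overline{C})$
OR           $C = A + B$                    $(A \lor B \lor \overline{C}) \land (\overline{A} \lor C) \land (\overline{B} \lor C)$
XOR          $C = A \oplus B$               $(\overline{A} \lor \overline{B} \lor \overline{C}) \land (A \lor B \lor \overline{C}) \land (A \lor \overline{B} \lor C) \land (\overline{A} \lor B \lor C)$
ANDNOT       $C = \overline{A} \cdot B$     $(A \lor \overline{B} \lor C) \land (\overline{A} \lor \overline{C}) \land (B \lor \overline{C})$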
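
A Tseytin encoding is correct when its clauses are all satisfied exactly for those assignments in which the output variable equals the gate output. The following Python sketch checks this by exhaustive enumeration for the five gate types. The clause strings use a leading `-` for a negated literal, and the ANDNOT gate is assumed to compute NOT A AND B; both the notation and the names `GATES`, `satisfied`, and `tseytin_is_exact` are choices made for this illustration.

```python
from itertools import product

GATES = {
    # name: (gate function, CNF clauses; a leading '-' marks a negated literal)
    "NOT": (lambda a, b: 1 - a, ["-A -C", "A C"]),
    "AND": (lambda a, b: a & b, ["-A -B C", "A -C", "B -C"]),
    "OR": (lambda a, b: a | b, ["A B -C", "-A C", "-B C"]),
    "XOR": (lambda a, b: a ^ b, ["-A -B -C", "A B -C", "A -B C", "-A B C"]),
    "ANDNOT": (lambda a, b: (1 - a) & b, ["A -B C", "-A -C", "B -C"]),  # assumed (NOT A) AND B
}


def satisfied(clauses, values):
    """Evaluate a CNF formula for one assignment of A, B and C."""
    def literal(lit):
        return 1 - values[lit[1:]] if lit.startswith("-") else values[lit]
    return all(any(literal(lit) for lit in clause.split()) for clause in clauses)


def tseytin_is_exact(name):
    """True if the clauses hold exactly when C equals the gate output."""
    gate, clauses = GATES[name]
    return all(
        satisfied(clauses, {"A": a, "B": b, "C": c}) == (c == gate(a, b))
        for a, b, c in product((0, 1), repeat=3)
    )


print({name: tseytin_is_exact(name) for name in GATES})
```

Every gate introduces one new variable and at most four short clauses, so the size of the CNF grows linearly with the number of gates in the circuit.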

A  C  program  that  constructs  a  problem  for  input  to  a  SAT  solver  was  created  [38].  It 
takes  three  plaintext-ciphertext-tweak  tuples  as  input  and  converts  them  to  their  respective 
implied  CNF  representations  in  the  DIMACS  format  commonly  used  by  SAT  solvers. 
Except  for  the  S-boxes,  the  cipher  consists  entirely  of  XOR  operations.  This  makes  the 
conversion  process  fairly  simple.  It  consists  of  converting  Equations  3.1,  3.2,  and  3.3  for 
each  round  into  CNF  using  the  logic  circuit  representation  from  Chapter  5  and  the  Tseytin 
transformations  of  the  operations  from  Table  6.1.  The  56  variables  representing  the  key  bits 
are  shared  among  the  parallel  cipher  representations.  The  64  tweak  bits  can  be  completely 
removed  from  the  CNF  representation  by  observing  that  the  XOR  addition  of  a  known  bit 
is  equivalent  to  the  NOT  operation  if  the  bit  is  one  and  to  doing  nothing  if  the  bit  is  zero. 
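
In the DIMACS format a variable is written as a positive integer and its negation as the corresponding negative integer, so folding a known tweak bit into the CNF amounts to a sign flip. The sketch below illustrates the idea; the function names and the least-significant-bit-first ordering are assumptions, not details of the program in [38].

```python
def xor_known(literal, known_bit):
    """XOR a DIMACS literal with a known constant bit by flipping its sign."""
    return -literal if known_bit else literal


def fold_tweak_byte(literals, tweak_byte):
    """Apply a known 8-bit tweak byte to eight literals, least significant bit first."""
    return [xor_known(lit, (tweak_byte >> i) & 1) for i, lit in enumerate(literals)]


print(fold_tweak_byte([1, 2, 3, 4, 5, 6, 7, 8], 0x54))
```

No clauses or variables are added for the tweak: the literals that feed the next gate are simply negated where the tweak bit is one.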

If  the  plaintext-ciphertext-tweak  tuples  are  correct,  the  constructed  SAT  problem  will  be 
satisfiable.  Due  to  the  small  block  size,  three  tuples  are  needed  to  imply  a  single  key  in  the 
case  of  SoDark-3. 

Table 6.2 shows statistics of the CNF representations for various numbers of rounds. The
representations of the test vectors from [5] were tested with three different SAT solvers:
CryptoMiniSat [28], Plingeling, and Treengeling [29]. All three are state-of-the-art
parallel solvers that have performed well in the International SAT Competitions [46].
Plingeling and Treengeling are part of the Lingeling family of SAT solvers, while
CryptoMiniSat is a fork of MiniSat [47] optimized for solving cryptological problems.
Table 6.2. CNF representation statistics.

Rounds   Clauses   Variables
2        3864      12438
3        7479      24252
4        11094     36066
5        14709     47880

Plingeling  and  Treengeling  were  successful  in  solving  the  SAT  problems  and  recovering 
the  key  for  up  to  four  rounds  while  CryptoMiniSat  only  managed  to  solve  the  two-  and 
three-round  SAT  problems.  For  five-round  problems,  none  of  the  solvers  could  find  solutions 
even  after  more  than  two  weeks  of  search.  Solution  times  for  each  of  the  solvers  are  plotted 
in  Figure  6.1. 
Figure 6.1. Performance of SAT solver attacks.

CHAPTER 7:
Brute  Force  Attacks 


7.1  Introduction 

As stated in Chapter 2, all ciphers can be broken by brute force. The size
of the key space must therefore be large enough to prevent exhaustive key search. With their small
key lengths of 56 bits, all SoDark variants can be assumed to be vulnerable to brute force
attacks in practice, see Table 2.3. To confirm this, and to measure the actual upper bound
of security for the algorithm, a brute force attack was mounted.

An efficient brute force attack necessitates a fast implementation of the cipher. Section 2.4.1
discusses some different approaches to fast exhaustive key search. From the investigations
so far, nothing has been uncovered that would prevent ASIC attacks from being successful
with speeds in the same order as the EFF’s ASIC-based computer on DES. Time and
resource limitations prohibit such an attempt in this case and restrict attempts to commonly
available computer hardware.

7.2  The  CUDA  Framework 

The Nvidia CUDA parallel computing framework was chosen for the brute force
implementation. It is a GPGPU framework primarily designed for use with Nvidia’s various
GPU products and provides a C-like programming language for writing programs that run
on them. A feature of recent generations of Nvidia GPUs that makes them particularly
suitable for brute force key search is the lop3.b32 PTX instruction. PTX is the
intermediate assembly-like language used by the CUDA framework and its lop3.b32 instruction
performs a bitwise 3-bit table lookup [48]. This single-instruction bitwise lookup enables
the creation of bitslicing implementations that are faster compared to implementations that
use only standard bitwise logic instructions.
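
The effect of a bitwise three-input table lookup can be emulated in a few lines of Python. The sketch below is an illustration of the semantics only, not CUDA code from the thesis: it assumes that bit `4a + 2b + c` of the 8-bit table gives the output for input bits `a`, `b`, `c`, which corresponds to deriving the table by evaluating the desired function on the constants 0xF0, 0xCC, and 0xAA. The names `lop3` and `lut_for` are hypothetical Python helpers.

```python
MASK32 = 0xFFFFFFFF


def lop3(a, b, c, lut):
    """Emulate a bitwise three-input table lookup on 32-bit words."""
    result = 0
    for index in range(8):
        if (lut >> index) & 1:
            ta = a if index & 4 else ~a
            tb = b if index & 2 else ~b
            tc = c if index & 1 else ~c
            result |= ta & tb & tc       # bit positions whose inputs select this table entry
    return result & MASK32


def lut_for(fn):
    """Derive the 8-bit table of a three-input function from 0xF0, 0xCC, 0xAA."""
    return fn(0xF0, 0xCC, 0xAA) & 0xFF


x, y, z = 0x12345678, 0x9ABCDEF0, 0x0F0F0F0F
table = lut_for(lambda a, b, c: (a & b) ^ c)
print(hex(table), lop3(x, y, z, table) == ((x & y) ^ z) & MASK32)
```

In this example a function that needs two standard logic instructions (an AND followed by an XOR) is evaluated for all 32 bit positions by a single lookup, which is the source of the speedup for LUT-based S-box circuits.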

The  execution  of  GPGPU  programs  differs  significantly  from  the  execution  of  programs 
on  normal  CPUs.  GPUs  can  have  thousands  of  cores  and  are  therefore  able  to  execute 
thousands  of  concurrent  threads.  Unlike  CPU  cores,  the  GPU  cores  execute  in  lockstep. 
While this is one of the reasons behind the speeds provided by GPGPU computing, it also
causes severe performance penalties for branching instructions. Fast GPGPU programs
therefore limit, or preferably eliminate, branch instructions. Performing operations on the
processor and GPU in parallel with copying data between computer and GPU memory also
improves performance by reducing latency [49].

7.3  Brute  Force  Bitslicing  Implementation 

A  CUDA  bitslicing  implementation  of  SoDark  was  developed  for  this  thesis  [38]  using  all 
methods  described  in  the  previous  section  to  achieve  close  to  optimal  performance.  It  takes 
two  or  three  plaintext-ciphertext-tweak  tuples  as  input  and  outputs  all  matching  keys.  It 
supports  using  several  CUDA  devices  and  launches  three  parallel  CPU  threads  per  GPU 
device  in  order  to  minimize  latency. 

The key space is divided into $2^{24}$ subsets of $2^{32}$ keys each. All keys in each set of $2^{32}$ keys
share the same three most significant key bytes. This means that the first round, which
uses only those key bytes, can be calculated once for all keys in the set. In the case of the
Lattice eight-round version, the same applies to the last round, see Table 3.2. This reduces
the number of rounds that the bitslicing part of the implementation has to perform from
eight to six. Importantly, only five rounds of S-box operations have to be performed.

With  the  guessed  states  after  the  first  and  before  the  last  round  having  been  computed  on  the 
CPU,  the  rest  of  the  key  bytes  are  tested  on  the  GPU  using  a  carefully  optimized  branch-free 
bitslicing  CUDA  implementation  of  rounds  two  through  six.  Since  the  platform  register 
size  is  32  bits,  each  kernel  iteration  tests  32  keys  in  parallel.  Instead  of  executing  branch 
instructions  on  the  GPU  to  test  for  expected  output,  the  comparison  is  done  using  bitwise 
logic instructions and the result copied to GPU memory. After each kernel is finished
executing for a certain subset of $2^{32}$ keys, the results are copied from GPU to computer
memory while another kernel executes for the next subset of keys.
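
The branch-free comparison can be illustrated with a small Python sketch. In a bitsliced layout, each 32-bit word holds one bit position of the state for 32 different keys, so comparing against an expected byte reduces to XOR, NOT, and AND operations on whole words. The helper names, the use of a single byte, and the example values are assumptions made for illustration; the actual CUDA kernel is found in [38].

```python
LANES = 32
ALL = (1 << LANES) - 1


def to_bitsliced(values, bits=8):
    """Transpose up to 32 byte values into one word per bit position."""
    return [
        sum(((v >> bit) & 1) << lane for lane, v in enumerate(values))
        for bit in range(bits)
    ]


def match_mask(planes, expected):
    """Bit `lane` of the result is one exactly when that lane equals `expected`."""
    mask = ALL
    for bit, plane in enumerate(planes):
        broadcast = -((expected >> bit) & 1) & ALL   # all ones or all zeros, no branch
        mask &= ~(plane ^ broadcast) & ALL
    return mask


outputs = [7 * i for i in range(LANES)]   # stand-in for 32 cipher outputs
planes = to_bitsliced(outputs)
print(bin(match_mask(planes, 21)))        # only lane 3 holds the value 21
```

The resulting word is nonzero only if at least one of the 32 keys produced the expected output, so the host can scan the copied results for nonzero words instead of the GPU branching on each key.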

After  copying  the  results  to  main  memory,  the  CPU  checks  for  keys  that  matched  the  first 
plaintext-ciphertext-tweak  tuple.  Matches  are  verified  against  the  second  and  third  tuples 
using  a  CPU  SoDark  implementation.  Keys  that  satisfy  all  three  tuples  are  output  as 
candidates. 
7.4 Attack in Practice

An exhaustive key search for all possible keys satisfying the first two plaintext-ciphertext-tweak
tuples from the test vectors in [5] was performed using the implementation described
in  the  previous  section.  The  computer  used  had  three  Nvidia  GeForce  GTX  1070  GPUs. 
The  entire  key  space  took  14  days  to  search  through.  All  keys  matching  the  two  tuples 
are  presented  in  Table  7.1.  This  effectively  proves  that  an  exhaustive  key  search  has  been 
successfully  performed. 
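
The reported search time implies an approximate key testing rate, which the following Python lines work out. The calculation treats the 14 days as an exact figure and assumes that the load was shared evenly between the three GPUs; neither assumption is stated in the text.

```python
keys = 2**56                      # size of the SoDark key space
seconds = 14 * 24 * 3600          # reported search time, taken as exactly 14 days
gpus = 3

total_rate = keys / seconds
per_gpu_rate = total_rate / gpus
print(f"{total_rate:.2e} keys/s in total, {per_gpu_rate:.2e} keys/s per GPU")
```

Under these assumptions the rate is roughly $6 \cdot 10^{10}$ keys per second overall, or about $2 \cdot 10^{10}$ keys per second per GPU.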

7.5 Ciphertext-Only Attack

The known-plaintext brute force attacks on the 2G and 3G ALE linking protection ciphers
can be extended to ciphertext-only attacks. This is made possible by the stereotypical nature
of  ALE  linking  operations  and  PDU  format.  In  many  cases,  parts  of  the  plaintexts  can  be 
accurately  guessed  only  by  observing  encrypted  message  traffic.  For  a  normal  2G  ALE  link 
establishment  call  as  described  in  Section  2.3,  the  three  bit  preamble  of  each  plaintext  will 
be  known.  Additionally,  it  is  known  that  the  addresses  in  the  first  two  PDUs  will  be  identical. 
In  total,  this  equals  about  30  bits  of  known  information  that  can  be  used  in  a  brute  force 
attack.  More  collected  ciphertext-tweak  pairs  will  be  needed  than  for  the  corresponding 
known-plaintext  attack,  in  order  to  reduce  the  set  of  candidate  keys  to  a  manageable  number. 

Ciphertext-only  attacks  become  easier  with  3G  ALE  PDUs.  This  is  due  to  the  fact  that 
only  24  bits  of  the  26-bit  PDUs  are  encrypted.  The  two  unencrypted  bits,  together  with 
observations  on  the  encrypted  traffic,  give  information  about  the  structure  of  the  plaintext 
allowing  bits  to  be  guessed.  Furthermore,  the  inclusion  of  a  CRC  value  in  the  plaintext 
allows  for  easy  validity  checking  of  plaintexts  during  the  attack. 
Table 7.1. All 218 keys that satisfy the first two Lattice test
vectors ([54E0CD, C0D705, 543BD88000017550] and [54E0CD, 708434,
543BD88040017550]), obtained through exhaustive search.

CHAPTER 8:
Conclusion 


8.1  Summary  of  Attacks 

Table  8.1  summarizes  the  attacks  on  SoDark-3  presented  in  this  thesis.  For  five  rounds 
and  fewer,  key  recovery  attacks  are  possible,  given  two  arbitrary  plaintext-ciphertext-tweak 
tuples.  The  attacks  have  time  complexities  that  are  significantly  lower  than  for  exhaustive 
key  search.  Additionally,  key  recovery  using  SAT  solvers  is  possible  for  four  and  fewer 
rounds. 


Table 8.1. Summary of attacks on SoDark-3. Time complexities are
weighted to be proportional to the brute force complexity of $2^{55}$ for the
same number of rounds (see Section 4.1).

Section  Type                          Rounds    Time          Data          Memory
4.3      Known-plaintext structural    2         $2^{9}$       2             22n
4.4      Known-plaintext structural    3         $2^{16}$      2             -
4.5      Known-plaintext structural    4         $2^{32.9}$    2             $2^{17.6}$
4.6      Known-plaintext structural    5         $2^{49}$      2             $2^{8}$
4.7      Chosen-tweak structural       6         $2^{46.1}$    2n.i          -
4.7      Chosen-tweak structural       7         $2^{46.1}$    $2^{12.7}$    -
4.8      Chosen-tweak structural       8         $2^{45.1}$    $2^{12.7}$    -
6        Known-plaintext SAT-based     $\le 4$   Low           3             Low
7.4      Known-plaintext brute force   ★         $2^{55}$      2             -
7.5      Ciphertext-only brute force   ★         $2^{55}$      > 2           -

Attacks on six, seven, and eight rounds also exist with low time complexities. Their data
complexities are manageable, but the requirements on relationships between tweaks make
the attacks hard to implement for a passive attacker. Referring back to Section 4.7, the
attack requires all bytes in the tweaks in a pair of plaintext-ciphertext-tweak tuples to be
identical, except for the fifth tweak byte. Considering the description of the ALE protocol
in Section 2.3, this may indeed be possible to arrange for an attacker that, for example, has
come into possession of a keyed ALE radio.

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It should be noted that, if tweaks are generated in accordance with the specifications in [4],
the fifth byte contains the word number (see Table 2.2). It will be the only byte that changes
between PDUs in a single linking transmission. There is therefore a small chance that the
output differentials required for the six-, seven-, and eight-round attacks will occur during
normal operation. For ALE networks that use AL-1, this probability will be higher than for
networks using AL-2, due to the longer PI.


In a normal three-PDU linking transmission, the PDUs form three different
plaintext-ciphertext-tweak tuple pairs, all with the required input differential. From
Equation 4.80, the number of intercepted linking transmissions required to find the correct
output differential with 50% probability is therefore


    11,629,080 / 3 = 3,876,360.    (8.1)


To put the number in perspective, it is equivalent to intercepting a linking transmission every
eight seconds for a year. This is obviously not a realistic setting, except possibly in some
very high-intensity military operations. It should be considered, however, that given the high
proliferation of ALE technology and considering all messages by all users worldwide, there
is certainly a non-negligible probability of the output differential appearing somewhere
within some sufficiently large time interval.
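The arithmetic behind Equation 8.1 and the eight-second figure can be checked with a few lines of Python. The sketch assumes that 11,629,080 is the number of tuple pairs obtained from Equation 4.80 and that a year has 365 days; the variable names are illustrative.

```python
from math import comb

# Number of tuple pairs needed for a 50% success probability (Equation 8.1).
PAIRS_NEEDED = 11_629_080

# A three-PDU linking transmission yields C(3, 2) = 3 distinct tuple pairs.
PAIRS_PER_TRANSMISSION = comb(3, 2)

SECONDS_PER_YEAR = 365 * 24 * 60 * 60

transmissions_needed = PAIRS_NEEDED // PAIRS_PER_TRANSMISSION
seconds_between_intercepts = SECONDS_PER_YEAR / transmissions_needed

if __name__ == "__main__":
    print(f"Linking transmissions needed: {transmissions_needed:,}")
    print(f"One interception every {seconds_between_intercepts:.1f} s for a year")
```

The result is slightly more than eight seconds between interceptions, which matches the estimate given above.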


The  demonstrated  feasibility  of  brute  force  attacks  on  the  SoDark  ciphers,  regardless  of  the 
number  of  rounds,  shows  that  the  level  of  protection  provided  by  ALE  linking  protection 
is  not  sufficient.  This  is  in  agreement  with  the  key  length  recommendations  presented  in 
Chapter  2. 


8.2  Discussion 

“Anyone,  from  the  most  clueless  amateur  to  the  best  cryptographer,  can  create 
an  algorithm  that  he  himself  can’t  break.  It’s  not  even  hard.  What  is  hard  is 
creating an algorithm that no one else can break, even after years of analysis.”

—  Bruce  Schneier  [50] 
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A fundamental maxim in cryptography is that one should not use proprietary or “homemade”
cipher algorithms in any setting that requires real security. The pitfalls in cipher
construction  are  many  and  even  world-leading  experts  have  failed  in  such  efforts.  The 
accepted  best  practice  is  to  use  well-known  algorithms  that  have  been  developed  and  vetted 
thoroughly  [51].  AES  is  probably  the  best  known  example  of  a  cipher  that  satisfies  these 
requirements.  For  that  reason,  it  should  come  as  no  surprise  that  it  is  the  world’s  most  used 
cipher  algorithm. 

With  this  in  mind,  the  decision  by  the  creators  of  the  ALE  standards  to  design  their  own 
cipher  algorithm  is  unfortunate.  At  the  time  2G  ALE  was  standardized,  DES — though  also 
a  56-bit  cipher — was  well  known  and  used.  Together  with  a  suitable  block  cipher  mode  of 
operation, it would have been a good candidate in lieu of Lattice/SoDark. In any case,
with  developments  during  the  1990s  in  both  cryptanalysis  and  demonstrated  exhaustive  key 
searches  performed  by,  among  others,  the  EFF  and  Distributed.net,  a  replacement  of  the 
56-bit  linking  protection  algorithm  should  have  been  considered  at  the  time. 

The  use  of  a  tweak  in  SoDark  to  thwart  replay  attacks,  which  was  novel  for  the  time,  should 
be  noted.  Not  only  does  it  fulfill  the  requirements  of  channel-  and  time-variation  well, 
it  also  effectively  prevents  the  construction  of  TMTO  attacks  to  which  other  ciphers  with 
weak  structure  and  short  key  lengths  are  susceptible.  While  many  design  decisions  made 
in  the  construction  of  the  ALE  linking  protection  ciphers  can  be  criticized,  the  design  and 
inclusion  of  a  tweak  is  certainly  not  one  of  those. 

The weaknesses presented in the SoDark cipher family and their impact on the ALE system
as a whole are a good example of how design flaws in subsystems affect the design goals
of  the  larger  system.  In  this  case,  the  design  goals  regarding  confidentiality,  integrity, 
and  availability  in  the  ALE  system  hinge  completely  on  the  cryptographic  strength  of  the 
SoDark  algorithm. 

An  attacker  with  knowledge  of  an  ALE  linking  protection  key  can  attack  an  ALE  HF 
radio  system  in  a  number  of  ways:  First,  the  attacker  can  compromise  confidentiality  by 
recovering  encrypted  plaintexts.  This  will  include  identities  of  senders  and  receivers  as  well 
as  any  orderwire  traffic  transmitted  using  the  ALE  protocol. 
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Second,  the  adversary  can  compromise  the  integrity  of  the  network  by  injecting  arbitrary 
ALE  PDUs.  This  can  be  leveraged  to  establish  links  and  inject  higher  level  protocol  traffic. 
The  ability  to  inject  PDUs  can  also  be  used  to  geographically  locate  other  stations,  by 
causing  them  to  automatically  transmit  responses  to  received  linking  requests. 

Third,  availability  attacks  are  possible  through  PDU  injection.  For  example,  by  saturating 
an  ALE  network  with  link  establishment  calls,  an  adversary  can  tie  up  all  radio  stations  in 
the  network  with  fake  traffic,  preventing  the  transmission  of  real  traffic. 

The  synchronous  nature  of  3G  ALE  makes  it  vulnerable  to  more  attacks  by  an  adversary 
with knowledge of the linking protection key. For example, by answering time
synchronization requests with faked replies that contain deliberately inaccurate time
information, the adversary can force radios out of the network.

It  is  also  worth  emphasizing  that  ALE  linking  protection,  whether  the  cipher  is  secure  or  not, 
only  protects  the  linking  process  itself.  After  the  link  has  been  established,  it  is  handed  off 
for  use  by  higher  level  protocols.  If  those  protocols  do  not  include  protection  mechanisms 
of  their  own,  attacks  on  established  links  are  possible  without  knowledge  of  the  linking 
protection  key  through  the  use  of  normal  electronic  warfare  traffic  injection  methods. 

8.3  Recommendations 

The  ciphers  in  the  SoDark  family  should  not  be  used. 

For  short-term  mitigation,  ALE  linking  protection  users  should  change  keys  at  least  on  a 
daily  basis,  regardless  of  their  threat  model.  If  the  threat  model  includes  adversaries  that 
have  access  to  the  resources  of  medium  or  large  organizations,  keys  should  be  assumed  to 
be  recovered  within,  at  most,  hours  from  interception  of  traffic.  Appropriate  changes  in 
operating  procedures  should  be  made  to  ensure  protection  of  confidentiality,  integrity,  and 
availability  in  the  system. 

For  long-term  mitigation,  the  solution  is  to  implement  secure  replacements  for  the  SoDark 
ciphers.  Users  that  have  access  to  AL-3  and  AL-4  linking  protection  ciphers  can  use  those. 
For  users  that  do  not,  a  suggested  replacement  for  SoDark  is  outlined  in  the  next  section. 
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8.4  A  Suggested  Replacement  for  SoDark 

According to [52], the ALE designers are aware of the questionable security of the SoDark
family. For that reason, they are considering a replacement cipher for fourth-generation (4G)
ALE. Unfortunately, a purpose-made cipher, Halfloop, is once again a candidate, both as a
replacement for 24- and 48-bit SoDark in 4G ALE and in a 96-bit version for encryption of
the 96-bit PDUs introduced in that standard.

A  better  option  would  be  to  use  encryption  based  on  best  practice  methods  to  replace 
SoDark  in  2G  and  3G  ALE  and  for  linking  protection  in  4G  ALE. 

AES  is  by  far  the  most  used  and  most  trusted  cipher  algorithm  today.  It  was  created  and 
standardized  through  an  open  and  rigorous  process.  Additionally,  it  is  the  first,  and  so 
far  only,  publicly  developed  cipher  approved  by  the  NSA  for  protection  of  U.S.  classified 
information. 

With a block size of 128 bits, AES cannot be applied as a drop-in replacement. However,
block ciphers are rarely used directly in applications. Adapting them to the data at hand is
the purpose of block cipher modes of operation. A mode of operation that preserves the
format of the encrypted PDUs as well as satisfies the other requirements on linking
protection is the Thorp shuffle [53]. It stands on a sound mathematical foundation and is
backed by solid reasoning concerning its security. It is well suited for format-preserving
encryption of the small blocks used in the ALE standards.

The Thorp shuffle is a maximally unbalanced Feistel network that encrypts a single bit per
round, so the number of rounds is equal to the block size. Figure 8.1 illustrates one round of
the Thorp shuffle. Here, AES is suggested as a round function. The authors of [53] present
a method, which they dub the 5x trick, that avoids calling the round function in every round.
Using this method, the function only needs to be called ⌈24/5⌉ = 5 times in the 24-bit case,
⌈48/5⌉ = 10 times in the 48-bit case, and ⌈96/5⌉ = 20 times in the 96-bit case. The number
of passes of the Thorp shuffle required for proper security is investigated in [53].
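The structure of the Thorp shuffle can be illustrated with a minimal Python sketch. It is not the construction from [53] in full. HMAC-SHA-256 from the standard library stands in for the AES round function, the 5x trick is not implemented, and the number of passes in the usage example is an arbitrary placeholder rather than a security recommendation. All function names are hypothetical.

```python
import hashlib
import hmac


def round_bit(key: bytes, tweak: bytes, rnd: int, rest: int, n: int) -> int:
    """Stand-in round function: one pseudorandom bit derived from the round
    number, the tweak, and the n - 1 unaffected bits (HMAC instead of AES)."""
    msg = rnd.to_bytes(4, "big") + tweak + rest.to_bytes((n + 6) // 8, "big")
    return hmac.new(key, msg, hashlib.sha256).digest()[0] & 1


def thorp_encrypt(key: bytes, tweak: bytes, block: int, n: int, rounds: int) -> int:
    """Encrypt an n-bit block, one bit per round."""
    mask = (1 << (n - 1)) - 1
    for rnd in range(rounds):
        b = block >> (n - 1)              # the bit encrypted in this round
        rest = block & mask               # the n - 1 unaffected bits
        d = b ^ round_bit(key, tweak, rnd, rest, n)
        block = (rest << 1) | d           # unaffected bits followed by d
    return block


def thorp_decrypt(key: bytes, tweak: bytes, block: int, n: int, rounds: int) -> int:
    """Invert thorp_encrypt by undoing the rounds in reverse order."""
    for rnd in reversed(range(rounds)):
        d = block & 1
        rest = block >> 1
        b = d ^ round_bit(key, tweak, rnd, rest, n)
        block = (b << (n - 1)) | rest
    return block


def round_function_calls(n: int) -> int:
    """Round function calls per pass with the 5x trick: ceil(n / 5)."""
    return -(-n // 5)


if __name__ == "__main__":
    key, tweak = bytes(16), bytes(8)      # placeholder key and 64-bit tweak
    n, passes = 24, 4                     # 'passes' is an arbitrary example value
    ct = thorp_encrypt(key, tweak, 0xABCDEF, n, passes * n)
    assert thorp_decrypt(key, tweak, ct, n, passes * n) == 0xABCDEF
    print([round_function_calls(size) for size in (24, 48, 96)])
```

Since every round maps an n-bit string to an n-bit string, the ciphertext has the same format as the plaintext regardless of the block size, which is the property required for the 24-, 48-, and 96-bit ALE PDUs.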

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Figure 8.1. One round of the Thorp shuffle with AES as the Feistel round
function. One bit, b, is encrypted into a bit d and concatenated with the
unaffected bits x. Adapted from [53].


Since n - 1 bits of input to the AES round function are used, where n is the block size, the
remaining 129 - n bits can be used to input a tweak. In the case of n = 96, only 33 bits are
available for tweak use. A solution to this could be to use an additional AES encryption
operation to compress the 64-bit tweak and add the result to the input in some suitable
manner.
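The tweak budget for each block size follows directly from the 128-bit AES input. The short sketch below tabulates it, assuming that the round function input carries only the n - 1 unaffected bits and the tweak; the names are illustrative.

```python
AES_BLOCK_BITS = 128
TWEAK_BITS = 64  # tweak length mentioned in the text


def spare_input_bits(n: int) -> int:
    """Bits of AES input left over after the n - 1 unaffected bits: 129 - n."""
    return AES_BLOCK_BITS - (n - 1)


if __name__ == "__main__":
    for n in (24, 48, 96):
        spare = spare_input_bits(n)
        verdict = "fits" if spare >= TWEAK_BITS else "does not fit"
        print(f"n = {n}: {spare} spare bits, 64-bit tweak {verdict}")
```

Only the 96-bit case falls short of the 64 bits needed, which is why some form of tweak compression is suggested for that block size.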

One  of  the  reasons  the  Rijndael  algorithm  was  selected  for  the  AES  standard  over  the 
other  candidates  was  its  speed  on  a  variety  of  platforms,  including  on  small  8-bit  embedded 
systems  [16].  This,  together  with  the  low  number  of  PDUs  encrypted  in  any  linking 
operation,  should  make  the  speed  of  the  proposed  solution  acceptable,  even  on  embedded 
hardware  in  field  radios. 


8.5  Ideas  for  Further  Research 

Many  lines  of  effort  were  abandoned  due  to  time  constraints.  They  may  provide  further 
insight  into  the  security  of  the  SoDark  family  of  ciphers. 

Structural attacks, like the ones described in Chapter 4, may be possible for more than eight
rounds.  The  filtering  technique  described  in  Sections  4.7  and  4.8  that  enables  identifying 
specific  differentials  many  rounds  into  the  cipher  with  high  probability  works  on  any  number 
of  rounds. 

No  structural  attacks  were  attempted  on  SoDark-6.  The  methods  developed  for  SoDark-3 
are  likely  applicable  and  may  yield  similar  results. 
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Approaches  to  algebraic  cryptanalysis  other  than  the  one  used  in  Chapter  6  may  prove 
fruitful.  For  example,  SAT  solvers  based  on  belief  propagation  tend  to  be  very  fast  in 
solving  known  satisfiable  SAT  problems.  In  some  cases  they  are  able  to  solve  very  large 
problems  where  other  SAT  solvers  fail  [44]. 

The  algorithms  used  to  create  the  logic  circuit  representations  were  designed  for  creating 
circuits  that  are  efficient  to  implement  on  modern  CPUs.  Modification  of  the  algorithms 
so  that  they  can  find  networks  with  all  14  non-trivial  Boolean  functions  of  two  variables 
would  likely  result  in  smaller  circuits  that  are  easier  for  SAT  solvers  to  handle. 
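The count of 14 can be reproduced by enumeration. The sketch below assumes that "non-trivial" excludes only the two constant functions, which is an interpretation and not something stated explicitly in the text.

```python
from itertools import product

# A function of two Boolean inputs is fully described by its outputs for the
# inputs (0,0), (0,1), (1,0), (1,1), that is, by a four-entry truth table.
truth_tables = list(product((0, 1), repeat=4))

# Assumption: "non-trivial" means that the function is not constant.
non_trivial = [table for table in truth_tables if len(set(table)) == 2]

if __name__ == "__main__":
    print(len(truth_tables), len(non_trivial))
```

Of the 16 possible truth tables, 14 remain once the constant-zero and constant-one functions are removed.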

An  extension  of  the  brute  force  solver  developed  in  Chapter  7  to  handle  the  ciphertext-only 
attacks  described  in  Section  7.5  would  provide  an  upper  bound  on  the  security  of  the  cipher 
in  best-case  conditions. 
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